From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mout02.posteo.de (mout02.posteo.de [185.67.36.66]) by mx.groups.io with SMTP id smtpd.web12.623.1618678271377655994 for ; Sat, 17 Apr 2021 09:51:12 -0700 Authentication-Results: mx.groups.io; dkim=fail reason="body hash did not verify" header.i=@posteo.de header.s=2017 header.b=J2mbSa5J; spf=pass (domain: posteo.de, ip: 185.67.36.66, mailfrom: mhaeuser@posteo.de) Received: from submission (posteo.de [89.146.220.130]) by mout02.posteo.de (Postfix) with ESMTPS id 995E42400E5 for ; Sat, 17 Apr 2021 18:51:08 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=posteo.de; s=2017; t=1618678268; bh=2t1NFEP2gGa42QDm/8xO4uQo5ZptBev7sQlHzi2uLFk=; h=Subject:To:Cc:From:Date:From; b=J2mbSa5J2gITzfaX/yWVF/qUb/Qow5XpTW6EH24uMS3lrjqE6GATeq7ZsU7bDyRGn KVkAMKPQI/Pi376b7owha6/EY1E8fdjqLXMkH99dwLhoGPq2hpR8KWb92Hmex3OS5P ce0679zJLz7acJupf1gk73Sw/U9z2mcbx49CZs1Qr1MZewTbrcaRVukBg5svxsgVUB 5XcEMa89v8Ar2ntCJ1vw4TELBdQDJTHFAPz4+DR0X5uXroDJ7Xc+x+4ukZQEunBTjo EIfm+bKsx5rqZcqFsVJaXRUoDjcicizVoNxV14xrj6Go3WuN4Rh+SjpVy0WhvMW0y7 aGsSI8/KQFcxw== Received: from customer (localhost [127.0.0.1]) by submission (posteo.de) with ESMTPSA id 4FMzcB510bz6tmX; Sat, 17 Apr 2021 18:51:06 +0200 (CEST) Subject: Re: [edk2-devel] VirtIO Sound Driver (GSoC 2021) To: devel@edk2.groups.io, harlydavidsen@gmail.com, Andrew Fish Cc: Leif Lindholm , Michael Brown , Mike Kinney , Laszlo Ersek , "Desimone, Nathaniel L" , Rafael Rodrigues Machado , Gerd Hoffmann References: <4AEC1784-99AF-47EF-B7DD-77F91EA3D7E9@apple.com> <309cc5ca-2ecd-79dd-b183-eec0572ea982@ipxe.org> <33e37977-2d27-36a0-89a6-36e513d06b2f@ipxe.org> <6F69BEA6-5B7A-42E5-B6DA-D819ECC85EE5@apple.com> <20210416113447.GG1664@vanye> <10E3436C-D743-4B2F-8E4B-7AD93B82FC92@apple.com> From: =?UTF-8?B?TWFydmluIEjDpHVzZXI=?= Message-ID: Date: Sat, 17 Apr 2021 16:51:06 +0000 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: quoted-printable On 16.04.21 19:45, Ethin Probst wrote: > Yes, three APIs (maybe like this) would work well: > - Start, Stop: begin playback of a stream > - SetVolume, GetVolume, Mute, Unmute: control volume of output and enab= le muting > - CreateStream, ReleaseStream, SetStreamSampleRate: Control sample > rate of stream (but not sample format since Signed 16-bit PCM is > enough) > Marvin, how do you suggest we make the events then? We need some way > of notifying the caller that the stream has concluded. We could make > the driver create the event and pass it back to the caller as an > event, but you'd still have dangling pointers (this is C, after all). > We could just make a IsPlaying() function and WaitForCompletion() > function and allow the driver to do the event handling -- would that > work? I do not know enough about the possible use-cases to tell. Aside from=20 the two functions you already mentioned, you could also take in an=20 (optional) notification function. Which possible use-cases does determining playback end have? If it's too=20 much effort, just use EFI_EVENT I guess, just the less code can mess it=20 up, the better. If I remember correctly you mentioned the UEFI Talkbox before, if that=20 is more convenient for you, I'm there as mhaeuser. Best regards, Marvin > > On 4/16/21, Andrew Fish wrote: >> >>> On Apr 16, 2021, at 4:34 AM, Leif Lindholm wrote: >>> >>> Hi Ethin, >>> >>> I think we also want to have a SetMode function, even if we don't get >>> around to implement proper support for it as part of GSoC (although I >>> expect at least for virtio, that should be pretty straightforward). >>> >> Leif, >> >> I=E2=80=99m think if we have an API to load the buffer and a 2nd API t= o play the >> buffer an optional 3rd API could configure the streams. >> >>> It's quite likely that speech for UI would be stored as 8kHz (or >>> 20kHz) in some systems, whereas the example for playing a tune in GRU= B >>> would more likely be a 44.1 kHz mp3/wav/ogg/flac. >>> >>> For the GSoC project, I think it would be quite reasonable to >>> pre-generate pure PCM streams for testing rather than decoding >>> anything on the fly. >>> >>> Porting/writing decoders is really a separate task from enabling the >>> output. I would much rather see USB *and* HDA support able to play pc= m >>> streams before worrying about decoding. >>> >> I agree it might turn out it is easier to have the text to speech code= just >> encode a PCM directly. >> >> Thanks, >> >> Andrew Fish >> >>> / >>> Leif >>> >>> On Fri, Apr 16, 2021 at 00:33:06 -0500, Ethin Probst wrote: >>>> Thanks for that explanation (I missed Mike's message). Earlier I sen= t >>>> a summary of those things that we can agree on: mainly, that we have >>>> mute, volume control, a load buffer, (maybe) an unload buffer, and a >>>> start/stop stream function. Now that I fully understand the >>>> ramifications of this I don't mind settling for a specific format an= d >>>> sample rate, and signed 16-bit PCM audio is, I think, the most widel= y >>>> used one out there, besides 64-bit floating point samples, which I'v= e >>>> only seen used in DAWs, and that's something we don't need. >>>> Are you sure you want the firmware itself to handle the decoding of >>>> WAV audio? I can make a library class for that, but I'll definitely >>>> need help with the security aspect. >>>> >>>> On 4/16/21, Andrew Fish via groups.io = wrote: >>>>> >>>>>> On Apr 15, 2021, at 5:59 PM, Michael Brown wrote: >>>>>> >>>>>> On 16/04/2021 00:42, Ethin Probst wrote: >>>>>>> Forcing a particular channel mapping, sample rate and sample form= at >>>>>>> on >>>>>>> everyone would complicate application code. From an application p= oint >>>>>>> of view, one would, with that type of protocol, need to do the >>>>>>> following: >>>>>>> 1) Load an audio file in any audio file format from any storage >>>>>>> mechanism. >>>>>>> 2) Decode the audio file format to extract the samples and audio >>>>>>> metadata. >>>>>>> 3) Resample the (now decoded) audio samples and convert (quantize= ) >>>>>>> the >>>>>>> audio samples into signed 16-bit PCM audio. >>>>>>> 4) forward the samples onto the EFI audio protocol. >>>>>> You have made an incorrect assumption that there exists a requirem= ent >>>>>> to >>>>>> be able to play audio files in arbitrary formats. This requiremen= t >>>>>> does >>>>>> not exist. >>>>>> >>>>>> With a protocol-mandated fixed baseline set of audio parameters >>>>>> (sample >>>>>> rate etc), what would happen in practice is that the audio files w= ould >>>>>> be >>>>>> encoded in that format at *build* time, using tools entirely exter= nal >>>>>> to >>>>>> UEFI. The application code is then trivially simple: it just does >>>>>> "load >>>>>> blob, pass blob to audio protocol". >>>>>> >>>>> >>>>> Ethin, >>>>> >>>>> Given the goal is an industry standard we value interoperability mo= re >>>>> that >>>>> flexibility. >>>>> >>>>> How about another use case. Lets say the Linux OS loader (Grub) wan= ts >>>>> to >>>>> have an accessible UI so it decides to sore sound files on the EFI >>>>> System >>>>> Partition and use our new fancy UEFI Audio Protocol to add audio to= the >>>>> OS >>>>> loader GUI. So that version of Grub needs to work on 1,000 of diffe= rent >>>>> PCs >>>>> and a wide range of UEFI Audio driver implementations. It is a much >>>>> easier >>>>> world if Wave PCM 16 bit just works every place. You could add a lo= t of >>>>> complexity and try to encode the audio on the fly, maybe even in Li= nux >>>>> proper but that falls down if you are booting from read only media = like >>>>> a >>>>> DVD or backup tape (yes people still do that in server land). >>>>> >>>>> The other problem with flexibility is you just made the test matrix >>>>> very >>>>> large for every driver that needs to get implemented. For something= as >>>>> complex as Intel HDA how you hook up the hardware and what CODECs y= ou >>>>> use >>>>> may impact the quality of the playback for a given board. Your EFI = is >>>>> likely >>>>> going to pick a single encoding at that will get tested all the tim= e if >>>>> your >>>>> system has audio, but all 50 other things you support not so much. = So >>>>> that >>>>> will required testing, and some one with audiophile ears (or an AI >>>>> program) >>>>> to test all the combinations. I=E2=80=99m not kidding I get BZs on = the quality >>>>> of >>>>> the boot bong on our systems. >>>>> >>>>> >>>>>>> typedef struct EFI_SIMPLE_AUDIO_PROTOCOL { >>>>>>> EFI_SIMPLE_AUDIO_PROTOCOL_RESET Reset; >>>>>>> EFI_SIMPLE_AUDIO_PROTOCOL_START Start; >>>>>>> EFI_SIMPLE_AUDIO_PROTOCOL_STOP Stop; >>>>>>> } EFI_SIMPLE_AUDIO_PROTOCOL; >>>>>> This is now starting to look like something that belongs in boot-t= ime >>>>>> firmware. :) >>>>>> >>>>> I think that got a little too simple I=E2=80=99d go back and look a= t the example >>>>> I >>>>> posted to the thread but add an API to load the buffer, and then pl= ay >>>>> the >>>>> buffer (that way we can an API in the future to twiddle knobs). Tha= t >>>>> API >>>>> also implements the async EFI interface. Trust me the 1st thing tha= t is >>>>> going to happen when we add audio is some one is going to complain = in >>>>> xyz >>>>> state we should mute audio, or we should honer audio volume and mut= e >>>>> settings from setup, or from values set in the OS. Or some one is g= oing >>>>> to >>>>> want the volume keys on the keyboard to work in EFI. >>>>> >>>>> Also if you need to pick apart the Wave PCM 16 byte file to feed it= into >>>>> the >>>>> audio hardware that probably means we should have a library that do= es >>>>> that >>>>> work, so other Audio drivers can share that code. Also having a lib= rary >>>>> makes it easier to write a unit test. We need to be security consci= ous >>>>> as we >>>>> need to treat the Audo file as attacker controlled data. >>>>> >>>>> Thanks, >>>>> >>>>> Andrew Fish >>>>> >>>>>> Michael >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>> >>>> -- >>>> Signed, >>>> Ethin D. Probst >>> >>> >>> >>> >> >