From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Andrew Fish"
Message-id: <537C3A1C-044A-49AD-86A1-F374DCA294E3@apple.com>
MIME-version: 1.0 (Mac OS X Mail 14.0 \(3654.20.0.2.1\))
Subject: Re: [edk2-devel] VirtIO Sound Driver (GSoC 2021)
Date: Sun, 18 Apr 2021 14:00:35 -0700
In-reply-to: <7f129be1-d8b9-bc4f-958f-21e5ac6cc3d9@posteo.de>
Cc: Ethin Probst, Leif Lindholm, Michael Brown, Mike Kinney, Laszlo Ersek, "Desimone, Nathaniel L", Rafael Rodrigues Machado, Gerd Hoffmann
To: devel@edk2.groups.io, mhaeuser@posteo.de
References: <4AEC1784-99AF-47EF-B7DD-77F91EA3D7E9@apple.com>
 <309cc5ca-2ecd-79dd-b183-eec0572ea982@ipxe.org>
 <33e37977-2d27-36a0-89a6-36e513d06b2f@ipxe.org>
 <6F69BEA6-5B7A-42E5-B6DA-D819ECC85EE5@apple.com>
 <20210416113447.GG1664@vanye>
 <10E3436C-D743-4B2F-8E4B-7AD93B82FC92@apple.com>
 <7459B8C0-EDF0-4760-97E7-D3338312B3DF@apple.com>
 <9b5f25d9-065b-257d-1d2d-7f80d14dec64@posteo.de>
 <6c0a4bf5-482e-b4f2-5df4-74930f4d979c@posteo.de>
 <7f129be1-d8b9-bc4f-958f-21e5ac6cc3d9@posteo.de>
Content-type: multipart/alternative; boundary="Apple-Mail=_7356BD0C-473A-494E-9EE4-A226AB7F9F79"

--Apple-Mail=_7356BD0C-473A-494E-9EE4-A226AB7F9F79
Content-Type: text/plain; charset=utf-8

> On Apr 18, 2021, at 12:22 PM, Marvin Häuser wrote:
>
> On 18.04.21 21:11, Marvin Häuser wrote:
>> On 18.04.21 17:22, Andrew Fish via groups.io wrote:
>>>
>>>
>>>> On Apr 18, 2021, at 1:55 AM, Ethin Probst wrote:
>>>>
>>>>> I think it would be best to sketch use-cases for audio and design the solutions closely to the requirements. Why do we need to know when audio finished? What will happen when we queue audio twice? There are many layers (UX, interface, implementation details) of questions in coming up with a pleasant and stable design.
>>>
>>> We are not using EFI to listen to music in the background. Any audio being played is part of a UI element and there might be synchronization requirements.
>>
>> Maybe I communicated that wrong, I'm not asking because I don't know what audio is used for, I am saying ideally there is a written-down list of usage requirements before the protocol is designed, because that is what the design targets. The details should follow the needs.
>>
>>>
>>> For example, when playing a boot bong on boot you want that to be asynchronous as you don’t want to delay boot to play the sound, but you may want to choose to gate some UI elements on the boot bong completing. If you are building a menu UI that is accessible you may need to synchronize playback with UI update, but you may not want to make the slow sound playback blocking as you can get other UI work done in parallel.
>>>
>>> The overhead for a caller making an async call is not much [1], but not having the capability could really restrict the API for its intended use. I’d also point out we picked the same pattern as the async BlockIO and there is something to be said for having consistency in the UEFI Spec and having similar APIs work in similar ways.
>
> Sorry for all the spam, but I somehow missed the "consistency" point. Sorry, but there seems to be no real consistency. Block I/O and network things generally use the token event method (what is suggested here), while USB, Bluetooth, and such generally pass a callback function directly (not necessarily what I suggest, as I don't know the full requirements, but certainly one way).
>

The networking interfaces are ancient, and it looks like the recent BT stack leans into that model. Also those callbacks are about returning data in the form of unknown packets, which is not the problem we are trying to solve.

The async Block IO is more of a model of notification of a queued event and I think that maps better into what we are doing. The MP Services protocol from the PI spec also uses an optional event to notify completion. At some point we realized that with a callback, or just an event, it was hard to return an error status, so that is why we ended up with the token in Block IO 2, so a status could be returned after the completion event was signaled.

There is also more flexibility with events as they let you define a GUID’ed event and broadcast this state to other entities if you want.

Thanks,

Andrew Fish

> Best regards,
> Marvin
>
>>
>> I'm not saying there should be *no* async playback, I am saying it may be worth considering implementing it differently from caller-owned events. I'm not concerned with overhead, I'm concerned with points of failure (e.g. leaks).
>>
>> I very briefly discussed some things with Ethin and it seems like the default EDK II timer interval of 10 ms may be problematic, but I am not sure. Just leaving it here as something to keep in mind.
>>
>> Best regards,
>> Marvin
>>
>>>
>>> [1] Overhead for making an asynchronous call.
>>> AUDIO_TOKEN AudioToken;
>>> gBS->CreateEvent (EVT_NOTIFY_SIGNAL, TPL_CALLBACK, NULL, NULL, &AudioToken.Event);
>>>
>>> Thanks,
>>>
>>> Andrew Fish
>>>
>>>> I would be happy to discuss this with you on the UEFI talkbox. I'm
>>>> draeand on there.
>>>> As for your questions:
>>>>
>>>> 1. The only reason I recommend using an event to signal audio
>>>> completion is because I do not want this protocol to be blocking at
>>>> all. (So, perhaps removing the token entirely is a good idea.) The
>>>> VirtIO audio device says nothing about synchronization, but I imagine
>>>> it's asynchronous because every audio specification I've seen out there
>>>> is asynchronous. Similarly, every audio API in existence -- at least,
>>>> every low-level OS-specific one -- is asynchronous/non-blocking.
>>>> (Usually, audio processing is handled on a separate thread.) However,
>>>> UEFI has no concept of threads or processes. Though we could use the
>>>> MP PI package to spin up a separate processor, that would fail on
>>>> uniprocessor, unicore systems. Audio processing needs a high enough
>>>> priority that it gets first in the list of tasks served while
>>>> simultaneously not getting a priority that's so high that it blocks
>>>> everything else. This is primarily because of the way an audio
>>>> subsystem is designed and the way an audio device functions: the audio
>>>> subsystem needs to know, immediately, when the audio buffer has run
>>>> out of samples and needs more, and it needs to react immediately to
>>>> refill the buffer if required, especially when streaming large amounts
>>>> of audio (e.g.: music).
>>>> Similarly, the audio subsystem needs the
>>>> ability to react as soon as is viable when playback is requested,
>>>> because any significant delay will be noticeable by the end-user. In
>>>> more complex systems like FMOD or OpenAL, the audio processing thread
>>>> also needs a high priority to ensure that audio effects, positioning
>>>> information, dithering, etc., can be configured immediately because
>>>> the user will notice if any glitches or delays occur. The UEFI audio
>>>> protocols obviously will be nowhere near as complex, or as advanced,
>>>> because no one will need audio effects in a preboot environment.
>>>> Granted, it's possible to make small audio effects, for example delays,
>>>> even if the protocol doesn't have functions to do that, but if an
>>>> end-user wants to go absolutely crazy with the audio samples and mix
>>>> in a really nice-sounding reverb or audio filter before sending the
>>>> samples to the audio engine, well, that's what they want to do and
>>>> that's out of our hands as driver/protocol developers. But I digress.
>>>> UEFI only has four TPLs, and so what we hopefully want is an engine
>>>> that is able to manage sample buffering and transmission, but also
>>>> doesn't block the application that's using the protocol. For some
>>>> things, blocking might be acceptable, but for speech synthesis or the
>>>> playing of startup sounds, this would not be an acceptable result and
>>>> would make the protocol pretty much worthless in the majority of
>>>> scenarios. So that's why I had an event to signal audio completion --
>>>> it was (perhaps) a cheap hack around the cooperatively-scheduled task
>>>> architecture of UEFI. (At least, I think it's cooperative multitasking,
>>>> correct me if I'm wrong.)
>>>> 2. The VirtIO specification does not specify what occurs in the event
>>>> that a request is received to play a stream that's already being
>>>> played. However, it does provide enough information for extrapolation.
>>>> Every request that's sent to a VirtIO sound device must come with two
>>>> things: a stream ID and a buffer of samples. The sample data must
>>>> immediately follow the request. Therefore, for VirtIO in particular,
>>>> the device will simply stop playing the old set of samples and play
>>>> the new set instead. This goes along with what I've seen in other
>>>> specifications like the HDA one: unless the device in question
>>>> supports more than one stream, it is impossible to play two sounds on
>>>> a single stream simultaneously, and an HDA controller (for example) is
>>>> not going to perform any mixing; mixing is done purely in software.
>>>> Similarly, if a device does support multiple streams, it is
>>>> unspecified whether the device will play two or more streams
>>>> simultaneously or whether it will pause/abort the playback of one
>>>> while it plays another. Therefore, I believe (though cannot confirm)
>>>> that OSes like Windows simply use a single stream, even if the device
>>>> supports multiple streams, and just make the applications believe
>>>> that unlimited streams are possible.
>>>>
>>>> I apologize for this really long-winded email, and I hope no one minds. :-)
>>>>
>>>> On 4/17/21, Marvin Häuser wrote:
>>>>> On 17.04.21 19:31, Andrew Fish via groups.io wrote:
>>>>>>
>>>>>>
>>>>>>> On Apr 17, 2021, at 9:51 AM, Marvin Häuser wrote:
>>>>>>>
>>>>>>> On 16.04.21 19:45, Ethin Probst wrote:
>>>>>>>> Yes, three APIs (maybe like this) would work well:
>>>>>>>> - Start, Stop: begin playback of a stream
>>>>>>>> - SetVolume, GetVolume, Mute, Unmute: control volume of output and
>>>>>>>> enable muting
>>>>>>>> - CreateStream, ReleaseStream, SetStreamSampleRate: Control sample
>>>>>>>> rate of stream (but not sample format since Signed 16-bit PCM is
>>>>>>>> enough)
>>>>>>>> Marvin, how do you suggest we make the events then?
>>>>>>>> We need some way
>>>>>>>> of notifying the caller that the stream has concluded. We could make
>>>>>>>> the driver create the event and pass it back to the caller as an
>>>>>>>> event, but you'd still have dangling pointers (this is C, after all).
>>>>>>>> We could just make an IsPlaying() function and WaitForCompletion()
>>>>>>>> function and allow the driver to do the event handling -- would that
>>>>>>>> work?
>>>>>>>
>>>>>>> I do not know enough about the possible use-cases to tell. Aside from the two functions you already mentioned, you could also take in an (optional) notification function.
>>>>>>> Which possible use-cases does determining playback end have? If it's too much effort, just use EFI_EVENT I guess, just the less code can mess it up, the better.
>>>>>>>
>>>>>>
>>>>>> In UEFI EFI_EVENT works much better. There is a gBS->WaitForEvent() function that lets a caller wait on an event. That is basically what the UEFI Shell is doing at the Shell prompt. A GUI in UEFI/C is basically an event loop.
>>>>>>
>>>>>> Fun fact: I ended up adding gIdleLoopEventGuid to the MdeModulePkg so the DXE Core could signal gIdleLoopEventGuid if you are sitting in gBS->WaitForEvent() and no event is signaled. Basically in EFI nothing is going to happen until the next timer tick so the gIdleLoopEventGuid lets you idle the CPU until the next timer tick. I was forced to do this as the 1st MacBook Air had a bad habit of thermal tripping when sitting at the UEFI Shell prompt. After all, another name for a loop in C code running on bare metal is a power virus.
>>>>>
>>>>> Mac EFI is one of the best implementations we know of, frankly.
>>>>> I'm traumatised by Aptio 4 and alike, where (some issues are OEM-specific I think) you can have timer events signalling after ExitBS, there is event clutter on IO polling to the point where everything lags no matter what you do, and even in "smooth" scenarios there may be nothing worth the description "granularity" (events scheduled to run every 10 ms may run every 50 ms). Events are the last resort for us, if there really is no other way. My first GUI implementation worked without events at all for this reason, but as our workarounds got better, we did start using them for keyboard and mouse polling.
>>>>>
>>>>> Timers do not apply here, but what does apply is resource management. Using EFI_EVENT directly means (to the outside) the introduction of a new resource to maintain, for each caller separately. On the other side, there is no resource to misuse or leak if none such is exposed. Yet, if you argue with APIs like WaitForEvent, something has to signal it. In a simple environment this would mean some timer event is running and may signal the event the main code waits for, where the above concerns actually do apply. :) Again, the recommendation assumes the use-cases are simple enough to easily avoid them.
>>>>>
>>>>> I think it would be best to sketch use-cases for audio and design the solutions closely to the requirements. Why do we need to know when audio finished? What will happen when we queue audio twice? There are many layers (UX, interface, implementation details) of questions in coming up with a pleasant and stable design.
>>>>>
>>>>> Best regards,
>>>>> Marvin
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Andrew Fish.
>>>>>>
>>>>>>> If I remember correctly you mentioned the UEFI Talkbox before, if
>>>>>>> that is more convenient for you, I'm there as mhaeuser.
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Marvin
>>>>>>>
>>>>>>>>
>>>>>>>> On 4/16/21, Andrew Fish wrote:
>>>>>>>>>
>>>>>>>>>> On Apr 16, 2021, at 4:34 AM, Leif Lindholm wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Ethin,
>>>>>>>>>>
>>>>>>>>>> I think we also want to have a SetMode function, even if we don't get around to implementing proper support for it as part of GSoC (although I expect at least for virtio, that should be pretty straightforward).
>>>>>>>>>>
>>>>>>>>> Leif,
>>>>>>>>>
>>>>>>>>> I’m thinking that if we have an API to load the buffer and a 2nd API to play the buffer, an optional 3rd API could configure the streams.
>>>>>>>>>
>>>>>>>>>> It's quite likely that speech for UI would be stored as 8kHz (or 20kHz) in some systems, whereas the example for playing a tune in GRUB would more likely be a 44.1 kHz mp3/wav/ogg/flac.
>>>>>>>>>>
>>>>>>>>>> For the GSoC project, I think it would be quite reasonable to pre-generate pure PCM streams for testing rather than decoding anything on the fly.
>>>>>>>>>>
>>>>>>>>>> Porting/writing decoders is really a separate task from enabling the output. I would much rather see USB *and* HDA support able to play pcm streams before worrying about decoding.
>>>>>>>>>>
>>>>>>>>> I agree, it might turn out it is easier to have the text to speech code just encode a PCM directly.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Andrew Fish
>>>>>>>>>
>>>>>>>>>> /
>>>>>>>>>> Leif
>>>>>>>>>>
>>>>>>>>>> On Fri, Apr 16, 2021 at 00:33:06 -0500, Ethin Probst wrote:
>>>>>>>>>>> Thanks for that explanation (I missed Mike's message).
>>>>>>>>>>> Earlier I sent a summary of those things that we can agree on: mainly, that we have mute, volume control, a load buffer, (maybe) an unload buffer, and a start/stop stream function. Now that I fully understand the ramifications of this I don't mind settling for a specific format and sample rate, and signed 16-bit PCM audio is, I think, the most widely used one out there, besides 64-bit floating point samples, which I've only seen used in DAWs, and that's something we don't need.
>>>>>>>>>>> Are you sure you want the firmware itself to handle the decoding of WAV audio? I can make a library class for that, but I'll definitely need help with the security aspect.
>>>>>>>>>>>
>>>>>>>>>>> On 4/16/21, Andrew Fish via groups.io wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> On Apr 15, 2021, at 5:59 PM, Michael Brown wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 16/04/2021 00:42, Ethin Probst wrote:
>>>>>>>>>>>>>> Forcing a particular channel mapping, sample rate and sample format on everyone would complicate application code. From an application point of view, one would, with that type of protocol, need to do the following:
>>>>>>>>>>>>>> 1) Load an audio file in any audio file format from any storage mechanism.
>>>>>>>>>>>>>> 2) Decode the audio file format to extract the samples and audio metadata.
>>>>>>>>>>>>>> 3) Resample the (now decoded) audio samples and convert (quantize) the audio samples into signed 16-bit PCM audio.
>>>>>>>>>>>>>> 4) forward the samples onto the EFI audio protocol.
>>>>>>>>>>>>> You have made an incorrect assumption that there exists a requirement to be able to play audio files in arbitrary formats. This requirement does not exist.
>>>>>>>>>>>>>
>>>>>>>>>>>>> With a protocol-mandated fixed baseline set of audio parameters (sample rate etc), what would happen in practice is that the audio files would be encoded in that format at *build* time, using tools entirely external to UEFI. The application code is then trivially simple: it just does "load blob, pass blob to audio protocol".
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Ethin,
>>>>>>>>>>>>
>>>>>>>>>>>> Given the goal is an industry standard we value interoperability more than flexibility.
>>>>>>>>>>>>
>>>>>>>>>>>> How about another use case. Let's say the Linux OS loader (Grub) wants to have an accessible UI so it decides to store sound files on the EFI System Partition and use our new fancy UEFI Audio Protocol to add audio to the OS loader GUI. So that version of Grub needs to work on 1,000s of different PCs and a wide range of UEFI Audio driver implementations. It is a much easier world if Wave PCM 16 bit just works every place. You could add a lot of complexity and try to encode the audio on the fly, maybe even in Linux proper, but that falls down if you are booting from read only media like a DVD or backup tape (yes people still do that in server land).
>>>>>>>>>>>>
>>>>>>>>>>>> The other problem with flexibility is you just made the test matrix very large for every driver that needs to get implemented. For something as complex as Intel HDA, how you hook up the hardware and what CODECs you use may impact the quality of the playback for a given board. Your EFI is likely going to pick a single encoding that will get tested all the time if your system has audio, but all 50 other things you support not so much. So that will require testing, and someone with audiophile ears (or an AI program) to test all the combinations. I’m not kidding, I get BZs on the quality of the boot bong on our systems.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>> typedef struct EFI_SIMPLE_AUDIO_PROTOCOL {
>>>>>>>>>>>>>> EFI_SIMPLE_AUDIO_PROTOCOL_RESET Reset;
>>>>>>>>>>>>>> EFI_SIMPLE_AUDIO_PROTOCOL_START Start;
>>>>>>>>>>>>>> EFI_SIMPLE_AUDIO_PROTOCOL_STOP Stop;
>>>>>>>>>>>>>> } EFI_SIMPLE_AUDIO_PROTOCOL;
>>>>>>>>>>>>> This is now starting to look like something that belongs in boot-time firmware. :)
>>>>>>>>>>>>>
>>>>>>>>>>>> I think that got a little too simple. I’d go back and look at the example I posted to the thread but add an API to load the buffer, and then play the buffer (that way we can add an API in the future to twiddle knobs). That API also implements the async EFI interface.
>>>>>>>>>>>> Trust me, the 1st thing that is going to happen when we add audio is someone is going to complain that in xyz state we should mute audio, or we should honor audio volume and mute settings from setup, or from values set in the OS. Or someone is going to want the volume keys on the keyboard to work in EFI.
>>>>>>>>>>>>
>>>>>>>>>>>> Also if you need to pick apart the Wave PCM 16 bit file to feed it into the audio hardware that probably means we should have a library that does that work, so other Audio drivers can share that code. Also having a library makes it easier to write a unit test. We need to be security conscious as we need to treat the Audio file as attacker controlled data.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Andrew Fish
>>>>>>>>>>>>
>>>>>>>>>>>>> Michael
>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -- 
>>>>>>>>>>> Signed,
>>>>>>>>>>> Ethin D. Probst
>>>>>>>>>>
>>>>
>>>> -- 
>>>> Signed,
>>>> Ethin D. Probst
>>>

--Apple-Mail=_7356BD0C-473A-494E-9EE4-A226AB7F9F79
Content-Type: text/html; charset=utf-8

On Apr 18, 2021, at 12:22 PM, Marvin Häuser <mhaeuser@posteo.de> wrote:

On 18.04.21 21:11, Marvin Häuser wrote:
On 18.04.21 17:22, Andrew Fish via groups.io wrote:


On Apr 18, 2021, at 1:55 AM, Ethin Probst <harlydavidsen@gmail.com> wrote:

I think it would be best to sketch use-cases for audio and design the solutions closely to the requirements. Why do we need to know when audio finished? What will happen when we queue audio twice? There are many layers (UX, interface, implementation details) of questions in coming up with a pleasant and stable design.

We are not using EFI to listen to music in the background. Any audio being played is part of a UI element and there might be synchronization requirements.

Maybe I communicated that wrong, I'm not asking because I don't know what audio is used for, I am saying ideally there is a written-down list of usage requirements before the protocol is designed, because that is what the design targets. The details should follow the needs.


For example, when playing a boot bong on boot you want that to be asynchronous as you don’t want to delay boot to play the sound, but you may want to choose to gate some UI elements on the boot bong completing. If you are building a menu UI that is accessible you may need to synchronize playback with UI update, but you may not want to make the slow sound playback blocking as you can get other UI work done in parallel.

The overhead for a caller making an async call is not much [1], but not having the capability could really restrict the API for its intended use. I’d also point out we picked the same pattern as the async BlockIO and there is something to be said for having consistency in the UEFI Spec and having similar APIs work in similar ways.

Sorry for all the spam, but I somehow missed the "consistency" point. Sorry, but there seems to be no real consistency. Block I/O and network things generally use the token event method (what is suggested here), while USB, Bluetooth, and such generally pass a callback function directly (not necessarily what I suggest, as I don't know the full requirements, but certainly one way).


The networking interfaces are ancient, and it looks like the recent BT stack leans into that model. Also those callbacks are about returning data in the form of unknown packets, which is not the problem we are trying to solve.

The async Block IO is more of a model of notification of a queued event and I think that maps better into what we are doing. The MP Services protocol from the PI spec also uses an optional event to notify completion. At some point we realized that with a callback, or just an event, it was hard to return an error status, so that is why we ended up with the token in Block IO 2, so a status could be returned after the completion event was signaled.
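
To make the token argument concrete, here is a minimal sketch in plain C. It is not EDK II code and it does not claim to be the protocol being designed: `MOCK_EVENT`, `MOCK_STATUS`, `mock_driver_complete()` and `audio_token_result()` are hypothetical stand-ins so the example builds outside firmware. Only the shape of the token (an event plus a status field) is borrowed from `EFI_BLOCK_IO2_TOKEN`, and the `AUDIO_TOKEN` name comes from footnote [1]; its status field is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for UEFI types so the sketch compiles outside EDK II. */
typedef int MOCK_STATUS;
#define MOCK_SUCCESS       0
#define MOCK_NOT_READY     6
#define MOCK_DEVICE_ERROR  7

typedef struct {
  int   signaled;                  /* set once the driver signals completion */
  void (*notify)(void *context);   /* optional notification function */
  void *context;
} MOCK_EVENT;

/* Hypothetical token, modeled on EFI_BLOCK_IO2_TOKEN (Event + TransactionStatus). */
typedef struct {
  MOCK_EVENT  *Event;
  MOCK_STATUS  TransactionStatus;
} AUDIO_TOKEN;

/* Driver side: record the outcome in the token first, then signal the event. */
static void mock_driver_complete(AUDIO_TOKEN *token, MOCK_STATUS result)
{
  assert(token != NULL && token->Event != NULL);
  token->TransactionStatus = result;   /* status must be valid before the signal */
  token->Event->signaled = 1;
  if (token->Event->notify != NULL) {
    token->Event->notify(token->Event->context);
  }
}

/* Caller side: the status is only meaningful once the event has fired. */
static MOCK_STATUS audio_token_result(const AUDIO_TOKEN *token)
{
  assert(token != NULL && token->Event != NULL);
  return token->Event->signaled ? token->TransactionStatus : MOCK_NOT_READY;
}
```

With a bare event or callback there is no obvious place for the error to live; with the token, the caller reads `TransactionStatus` after the event fires and can tell a finished playback from a failed one.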

There is also more flexibility with events as they let you define a GUID’ed event and broadcast this state to other entities if you want.
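
A rough illustration of the broadcast idea, again as a self-contained mock rather than firmware code. In real UEFI this would presumably go through `gBS->CreateEventEx()`, which takes an event group GUID so that signaling one event in the group notifies every member. Everything below (`MOCK_GUID`, `mock_create_event_ex()`, `mock_signal_group()`, the fixed-size listener table) is hypothetical and only mimics that behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_LISTENERS 8

typedef struct { unsigned char bytes[16]; } MOCK_GUID;
typedef void (*MOCK_NOTIFY)(void *context);

typedef struct {
  MOCK_GUID    group;
  MOCK_NOTIFY  notify;
  void        *context;
} MOCK_LISTENER;

static MOCK_LISTENER g_listeners[MAX_LISTENERS];
static size_t        g_listener_count;

/* Register a listener in an event group; returns 0 on success, -1 if the table is full. */
static int mock_create_event_ex(const MOCK_GUID *group, MOCK_NOTIFY notify, void *context)
{
  assert(group != NULL && notify != NULL);
  if (g_listener_count == MAX_LISTENERS) {
    return -1;
  }
  g_listeners[g_listener_count].group   = *group;
  g_listeners[g_listener_count].notify  = notify;
  g_listeners[g_listener_count].context = context;
  g_listener_count++;
  return 0;
}

/* Signal every listener registered for this group; returns how many were notified. */
static int mock_signal_group(const MOCK_GUID *group)
{
  int notified = 0;
  for (size_t i = 0; i < g_listener_count; i++) {
    if (memcmp(&g_listeners[i].group, group, sizeof *group) == 0) {
      g_listeners[i].notify(g_listeners[i].context);
      notified++;
    }
  }
  return notified;
}

/* Example notification function: bump a counter owned by the listener. */
static void count_notification(void *context)
{
  ++*(int *)context;
}
```

The point of the sketch is that the audio driver does not need to know who is listening: a UI element and, say, a logger can both register for the same "playback finished" group and be notified by one signal.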

Thanks,

Andrew Fish

Best regards,
Marvin


I'm not saying there should be *no* async playback, I am saying it may be worth considering implementing it differently from caller-owned events. I'm not concerned with overhead, I'm concerned with points of failure (e.g. leaks).

I very briefly discussed some things with Ethin and it seems like the default EDK II timer interval of 10 ms may be problematic, but I am not sure. Just leaving it here as something to keep in mind.

Best regards,
Marvin


[1] Overhead for making an asynchronous call.
AUDIO_TOKEN AudioToken;
gBS->CreateEvent (EVT_NOTIFY_SIGNAL, TPL_CALLBACK, NULL, NULL, &AudioToken.Event);
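
The caller-side pattern behind this footnote (start playback, keep doing UI work, notice completion later) can be sketched as a small simulation. This is a mock, not EDK II code: `MOCK_EVENT`, `MOCK_PLAYBACK`, `mock_play_async()`, `mock_timer_tick()` and `run_ui_while_playing()` are hypothetical names, and the "timer tick" is just a loop iteration. In firmware the wait would more likely use `gBS->CheckEvent()` or `gBS->WaitForEvent()`. One caveat, as far as I can tell from the UEFI spec: `CreateEvent()` rejects `EVT_NOTIFY_SIGNAL` with a NULL notify function, so the footnote is best read as shorthand for "one extra call" rather than as copy-paste code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { bool signaled; } MOCK_EVENT;

/* Hypothetical token: completion event plus a status slot (0 means success). */
typedef struct {
  MOCK_EVENT Event;
  int        Status;
} AUDIO_TOKEN;

/* Mock "device": finishes playback after a fixed number of timer ticks. */
typedef struct {
  AUDIO_TOKEN *token;
  int          ticks_left;
} MOCK_PLAYBACK;

/* Queue playback and return immediately; a non-positive duration completes at once. */
static void mock_play_async(MOCK_PLAYBACK *pb, AUDIO_TOKEN *token, int duration_ticks)
{
  assert(pb != NULL && token != NULL);
  token->Status = 0;
  token->Event.signaled = (duration_ticks <= 0);
  pb->token = token;
  pb->ticks_left = duration_ticks > 0 ? duration_ticks : 0;
}

/* Advance the mock device by one tick and signal the token when playback ends. */
static void mock_timer_tick(MOCK_PLAYBACK *pb)
{
  if (pb->ticks_left > 0 && --pb->ticks_left == 0) {
    pb->token->Event.signaled = true;
  }
}

/* Caller: do one unit of UI work per tick until the completion event fires.
   Returns how many UI updates ran while the sound was playing. */
static int run_ui_while_playing(int duration_ticks)
{
  AUDIO_TOKEN   token;
  MOCK_PLAYBACK pb;
  int           ui_updates = 0;

  mock_play_async(&pb, &token, duration_ticks);
  while (!token.Event.signaled) {
    ui_updates++;            /* e.g. redraw a menu entry */
    mock_timer_tick(&pb);
  }
  return ui_updates;
}
```

For a boot bong lasting three ticks, the caller gets three UI updates done instead of sitting in a blocking call, and it can still gate a later UI element on `token.Event.signaled`.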

Thanks,

Andrew Fish

I would be h= appy to discuss this with you on the UEFI talkbox. I'm
draean= d on there.
As for your questions:

1. The only reason I recommend using an event to signal audio
completion is because I do not want this protocol to be blocking at
all. (So, perhaps removing the token entirely is a good idea.) The
VirtIO audio device says nothing about synchronization, but I imagine
it's asynchronous because every audio specification I've seen out there
is asynchronous. Similarly, every audio API in existence -- at least,
every low-level OS-specific one -- is asynchronous/non-blocking.
(Usually, audio processing is handled on a separate thread.) However,
UEFI has no concept of threads or processes. Though we could use the
MP PI package to spin up a separate processor, that would fail on
uniprocessor, unicore systems. Audio processing needs a high enough
priority that it gets first in the list of tasks served while
simultaneously not getting a priority that's so high that it blocks
everything else. This is primarily because of the way an audio
subsystem is designed and the way an audio device functions: the audio
subsystem needs to know, immediately, when the audio buffer has run
out of samples and needs more, and it needs to react immediately to
refill the buffer if required, especially when streaming large amounts
of audio (e.g.: music). Similarly, the audio subsystem needs the
ability to react as soon as is viable when playback is requested,
because any significant delay will be noticeable by the end-user. In
more complex systems like FMOD or OpenAL, the audio processing thread
also needs a high priority to ensure that audio effects, positioning
information, dithering, etc., can be configured immediately because
the user will notice if any glitches or delays occur. The UEFI audio
protocols obviously will be nowhere near as complex, or as advanced,
because no one will need audio effects in a preboot environment.
Granted, it's possible to make small audio effects, for example delays,
even if the protocol doesn't have functions to do that, but if an
end-user wants to go absolutely crazy with the audio samples and mix
in a really nice-sounding reverb or audio filter before sending the
samples to the audio engine, well, that's what they want to do and
that's out of our hands as driver/protocol developers. But I digress.
UEFI only has four TPLs, and so what we hopefully want is an engine
that is able to manage sample buffering and transmission, but also
doesn't block the application that's using the protocol. For some
things, blocking might be acceptable, but for speech synthesis or the
playing of startup sounds, this would not be an acceptable result and
would make the protocol pretty much worthless in the majority of
scenarios. So that's why I had an event to signal audio completion --
it was (perhaps) a cheap hack around the cooperatively-scheduled task
architecture of UEFI. (At least, I think it's cooperative multitasking,
correct me if I'm wrong.)
2. The VirtIO specification does not specify what occurs in the event
that a request is received to play a stream that's already being
played. However, it does provide enough information for extrapolation.
Every request that's sent to a VirtIO sound device must come with two
things: a stream ID and a buffer of samples. The sample data must
immediately follow the request. Therefore, for VirtIO in particular,
the device will simply stop playing the old set of samples and play
the new set instead. This goes along with what I've seen in other
specifications like the HDA one: unless the device in question
supports more than one stream, it is impossible to play two sounds on
a single stream simultaneously, and an HDA controller (for example) is
not going to perform any mixing; mixing is done purely in software.
Similarly, if a device does support multiple streams, it is
unspecified whether the device will play two or more streams
simultaneously or whether it will pause/abort the playback of one
while it plays another. Therefore, I believe (though cannot confirm)
that OSes like Windows simply use a single stream, even if the device
supports multiple streams, and just make the applications believe
that unlimited streams are possible.
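
To make the refill requirement from point 1 concrete, here is a minimal sketch in plain C of a sample ring buffer with a low-water check. It is only an illustration under stated assumptions: `sample_ring`, `ring_write`, `ring_read` and `ring_needs_refill` are hypothetical names, not part of EDK II or the VirtIO specification, and a real driver would hand the samples to the device rather than to a second function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_CAPACITY 1024u   /* in samples; must be a power of two */

/* Hypothetical ring of signed 16-bit PCM samples. head and tail count
 * samples ever written/consumed, so head - tail is the current fill. */
typedef struct {
    int16_t samples[RING_CAPACITY];
    size_t  head;
    size_t  tail;
} sample_ring;

static size_t ring_fill(const sample_ring *r)
{
    assert(r != NULL);
    return r->head - r->tail;
}

/* Producer side: copy in as many samples as fit and return that count.
 * It never waits, so the caller is not blocked when the ring is full. */
static size_t ring_write(sample_ring *r, const int16_t *src, size_t count)
{
    size_t space = RING_CAPACITY - ring_fill(r);
    if (count > space) {
        count = space;
    }
    for (size_t i = 0; i < count; i++) {
        r->samples[(r->head + i) & (RING_CAPACITY - 1u)] = src[i];
    }
    r->head += count;
    return count;
}

/* Consumer side (stands in for the device draining the buffer). */
static size_t ring_read(sample_ring *r, int16_t *dst, size_t count)
{
    size_t avail = ring_fill(r);
    if (count > avail) {
        count = avail;
    }
    for (size_t i = 0; i < count; i++) {
        dst[i] = r->samples[(r->tail + i) & (RING_CAPACITY - 1u)];
    }
    r->tail += count;
    return count;
}

/* True when the fill level has dropped below the caller's threshold,
 * i.e. the moment the audio engine should top the buffer up again. */
static int ring_needs_refill(const sample_ring *r, size_t low_water)
{
    return ring_fill(r) < low_water;
}
```

A periodic callback could call `ring_needs_refill()` and, when it returns true, push the next slice of the stream with `ring_write()`. How often that callback must run depends on the sample rate and on the ring size.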

I apologize for this really long-winded email, and I hope no one minds. :-)

On 4/17/21, Marvin Häuser <mhaeuser@posteo.de> wrote:
On 17.04.21 19:31, Andrew Fish via groups.io wrote:


On Apr 17, 2021, at 9:51 AM, Marvin Häuser <mhaeuser@posteo.de> wrote:
On 16.04.21 19:45, Ethin Probst wrote:
Yes, three APIs (maybe like this) would work well:
- Start, Stop: begin and end playback of a stream
- SetVolume, GetVolume, Mute, Unmute: control volume of output and
  enable muting
- CreateStream, ReleaseStream, SetStreamSampleRate: control sample
  rate of stream (but not sample format since signed 16-bit PCM is
  enough)
Marvin, how do you suggest we make the events then? We need some way
of notifying the caller that the stream has concluded. We could make
the driver create the event and pass it back to the caller as an
event, but you'd still have dangling pointers (this is C, after all).
We could just make an IsPlaying() function and a WaitForCompletion()
function and allow the driver to do the event handling -- would that
work?

I do not know enough about the possible use-cases to tell. Aside from
the two functions you already mentioned, you could also take in an
(optional) notification function.
Which possible use-cases does determining playback end have? If it's
too much effort, just use EFI_EVENT I guess, just the less code can
mess it up, the better.


In UEFI EFI_EVENT works much better. There is a gBS->WaitForEvent()
function that lets a caller wait on an event. That is basically what
the UEFI Shell is doing at the Shell prompt. A GUI in UEFI/C is
basically an event loop.
Fun fact: I ended up adding gIdleLoopEventGuid to the MdeModulePkg so
the DXE Core could signal gIdleLoopEventGuid if you are sitting in
gBS->WaitForEvent() and no event is signaled. Basically in EFI nothing
is going to happen until the next timer tick so the gIdleLoopEventGuid
lets you idle the CPU until the next timer tick. I was forced to do
this as the 1st MacBook Air had a bad habit of thermal tripping when
sitting at the UEFI Shell prompt. After all another name for a loop in
C code running on bare metal is a power virus.
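
The wait-then-idle behaviour described above can be modelled in a few lines of plain C. This is a simulation only, not the DXE Core implementation: `sim_event`, `sim_timer`, `sim_wait_for_event` and `sim_idle_until_tick` are hypothetical names, and the idle hook here just advances a fake tick counter where real firmware would halt the CPU until the timer interrupt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an event: only the signaled state matters. */
typedef struct {
    bool signaled;
} sim_event;

typedef void (*sim_idle_fn)(void *context);

/* Model of waiting on a set of events. Returns the index of the first
 * signaled event and clears it. When nothing is signaled, the idle hook
 * runs instead of spinning, which is the role the idle-loop event plays. */
static size_t sim_wait_for_event(sim_event *events, size_t count,
                                 sim_idle_fn idle, void *context)
{
    assert(events != NULL && idle != NULL);
    for (;;) {
        for (size_t i = 0; i < count; i++) {
            if (events[i].signaled) {
                events[i].signaled = false;
                return i;
            }
        }
        idle(context);   /* nothing can change before the next tick */
    }
}

/* Fake timer for the simulation: each idle call is one tick, and one
 * chosen event becomes signaled when the tick counter reaches fire_at. */
typedef struct {
    sim_event *events;
    unsigned   ticks;
    unsigned   fire_at;
    size_t     fire_index;
} sim_timer;

static void sim_idle_until_tick(void *context)
{
    sim_timer *timer = context;
    timer->ticks++;
    if (timer->ticks == timer->fire_at) {
        timer->events[timer->fire_index].signaled = true;
    }
}
```

With two events and a timer set to fire event 1 on the third tick, `sim_wait_for_event()` runs the idle hook three times and then returns index 1. The point of the model is that the only work done between ticks is the idle hook itself.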

Mac EFI is one of the best implementations we know of, frankly. I'm
traumatised by Aptio 4 and alike, where (some issues are OEM-specific I
think) you can have timer events signalling after ExitBS, there is event
clutter on IO polling to the point where everything lags no matter what
you do, and even in "smooth" scenarios there may be nothing worth the
description "granularity" (events scheduled to run every 10 ms may run
every 50 ms). Events are the last resort for us, if there really is no
other way. My first GUI implementation worked without events at all for
this reason, but as our workarounds got better, we did start using them
for keyboard and mouse polling.

Timers do not apply here, but what does apply is resource management.
Using EFI_EVENT directly means (to the outside) the introduction of a
new resource to maintain, for each caller separately. On the other side,
there is no resource to misuse or leak if none such is exposed. Yet, if
you argue with APIs like WaitForEvent, something has to signal it. In a
simple environment this would mean some timer event is running and may
signal the event the main code waits for, where the above concerns
actually do apply. :) Again, the recommendation assumes the use-cases
are simple enough to easily avoid them.

I think it would be best to sketch use-cases for audio and design the
solutions closely to the requirements. Why do we need to know when audio
finished? What will happen when we queue audio twice? There are many
layers (UX, interface, implementation details) of questions in coming up
with a pleasant and stable design.

Best regards,
Marvin


Thanks,

Andrew Fish.

If I remember correctly you mentioned the UEFI Talkbox before, if
that is more convenient for you, I'm there as mhaeuser.

Best regards,
Marvin


On 4/16/21, Andrew Fish <afish@apple.com> wrote:

On Apr 16, 2021, at 4:34 AM, Leif Lindholm <leif@nuviainc.com> wrote:

Hi Ethin,

I think we also want to have a SetMode function, even if we don't get
around to implementing proper support for it as part of GSoC (although I
expect at least for virtio, that should be pretty straightforward).

Leif,

I'm thinking that if we have an API to load the buffer and a 2nd API to
play the buffer, an optional 3rd API could configure the streams.

It's quite likely that speech for UI would be stored as 8 kHz (or
20 kHz) in some systems, whereas the example for playing a tune in GRUB
would more likely be a 44.1 kHz mp3/wav/ogg/flac.

For the GSoC project, I think it would be quite reasonable to
pre-generate pure PCM streams for testing rather than decoding
anything on the fly.

Porting/writing decoders is really a separate task from enabling the
output. I would much rather see USB *and* HDA support able to play PCM
streams before worrying about decoding.

I agree, it might turn out to be easier to have the text-to-speech
code just encode PCM directly.

Thanks,

Andrew Fish

/
    Leif

On Fri, Apr 16, 2021 at 00:33:06 -0500= , Ethin Probst wrote:
Th= anks for that explanation (I missed Mike's message). Earlier I
sent
a summary of those things that we can agree on: mainly= , that we have
mute, volume control, a load buffer, (maybe) a= n unload buffer, and a
start/stop stream function. Now that I= fully understand the
ramifications of this I don't mind sett= ling for a specific format
and
sample rate, and= signed 16-bit PCM audio is, I think, the most
widely
used one out there, besides 64-bit floating point samples, which
I've
only seen used in DAWs, and that's something = we don't need.
Are you sure you want the firmware itself to h= andle the decoding of
WAV audio? I can make a library class f= or that, but I'll definitely
need help with the security aspe= ct.

On 4/16/21, Andrew Fish via groups.io <afish=apple.com@groups.io> wrote:

On Apr 15, 2021, at 5:59 PM, Michael Brown <mcb30@ipxe.org> wrote:

On 16/04/2021 00:42, Ethin Probst wrote:
Forcing a particular channel mapping, sample rate and sample format on
everyone would complicate application code. From an application point
of view, one would, with that type of protocol, need to do the
following:
1) Load an audio file in any audio file format from any storage
mechanism.
2) Decode the audio file format to extract the samples and audio
metadata.
3) Resample the (now decoded) audio samples and convert (quantize) the
audio samples into signed 16-bit PCM audio.
4) Forward the samples onto the EFI audio protocol.
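
For a sense of scale, the quantize half of step 3 is small. The sketch below, in plain C, converts normalized floating-point samples to signed 16-bit PCM. It assumes input in the range [-1.0, 1.0], clamps anything outside it, and leaves resampling out entirely; `quantize_s16` and `quantize_buffer` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert one normalized float sample to signed 16-bit PCM.
 * Out-of-range input is clamped, NaN becomes silence, and the result is
 * rounded to the nearest integer (halves away from zero). Scaling by
 * 32767 keeps the output symmetric, so -32768 is never produced. */
static int16_t quantize_s16(float sample)
{
    float scaled;

    if (sample != sample) {          /* NaN compares unequal to itself */
        return 0;
    }
    if (sample > 1.0f) {
        sample = 1.0f;
    }
    if (sample < -1.0f) {
        sample = -1.0f;
    }
    scaled = sample * 32767.0f;
    return (int16_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
}

/* Convert a whole buffer; in and out must each hold count samples. */
static void quantize_buffer(const float *in, int16_t *out, size_t count)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        out[i] = quantize_s16(in[i]);
    }
}
```

Decoding (step 2) and sample-rate conversion are where most of the real complexity sits, which is the part a fixed baseline format moves out of the firmware.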
You have made an incorrect assumption that there exists a requirement to
be able to play audio files in arbitrary formats.  This requirement does
not exist.
With a protocol-mandated fixed baseline set of audio parameters (sample
rate etc), what would happen in practice is that the audio files would be
encoded in that format at *build* time, using tools entirely external to
UEFI.  The application code is then trivially simple: it just does "load
blob, pass blob to audio protocol".


Ethin,

Given the goal is an industry standard we value interoperability more
than flexibility.

How about another use case. Let's say the Linux OS loader (Grub) wants to
have an accessible UI so it decides to store sound files on the EFI System
Partition and use our new fancy UEFI Audio Protocol to add audio to the OS
loader GUI. So that version of Grub needs to work on 1,000s of different PCs
and a wide range of UEFI Audio driver implementations. It is a much easier
world if Wave PCM 16 bit just works every place. You could add a lot of
complexity and try to encode the audio on the fly, maybe even in Linux
proper, but that falls down if you are booting from read-only media like a
DVD or backup tape (yes people still do that in server land).

The other problem with flexibility is you just made the test matrix very
large for every driver that needs to get implemented. For something as
complex as Intel HDA, how you hook up the hardware and what CODECs you use
may impact the quality of the playback for a given board. Your EFI is likely
going to pick a single encoding that will get tested all the time if your
system has audio, but all 50 other things you support not so much. So that
will require testing, and someone with audiophile ears (or an AI program)
to test all the combinations. I'm not kidding, I get BZs on the quality of
the boot bong on our systems.


typedef struct EFI_SIMPLE_AUDIO_PROTOCOL {
  EFI_SIMPLE_AUDIO_PROTOCOL_RESET Reset;
  EFI_SIMPLE_AUDIO_PROTOCOL_START Start;
  EFI_SIMPLE_AUDIO_PROTOCOL_STOP  Stop;
} EFI_SIMPLE_AUDIO_PROTOCOL;
This is now starting to look like something that belongs in boot-time
firmware.  :)

I think that got a little too simple. I'd go back and look at the example I
posted to the thread but add an API to load the buffer, and then play the
buffer (that way we can add an API in the future to twiddle knobs). That API
also implements the async EFI interface. Trust me, the 1st thing that is
going to happen when we add audio is someone is going to complain that in xyz
state we should mute audio, or we should honor audio volume and mute
settings from setup, or from values set in the OS. Or someone is going to
want the volume keys on the keyboard to work in EFI.
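
One way to picture the shape suggested here (load, then play with an optional completion token, plus volume and mute knobs) is the plain-C sketch below. Every name in it is hypothetical: it is not the example posted earlier in the thread and not a proposed EFI protocol definition, and the callback in `audio_token` merely stands in for an EFI_EVENT. The mock device completes playback immediately so that the flow can be exercised without hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    AUDIO_OK = 0,
    AUDIO_INVALID_PARAMETER,
    AUDIO_NOT_READY
} audio_status;

/* Completion token: the callback stands in for signaling an event. */
typedef struct {
    void        (*on_complete)(void *context);
    void         *context;
    audio_status  status;
} audio_token;

typedef struct simple_audio simple_audio;
struct simple_audio {
    audio_status (*load_buffer)(simple_audio *self, const int16_t *pcm,
                                size_t sample_count);
    audio_status (*play)(simple_audio *self, audio_token *token);
    audio_status (*set_volume)(simple_audio *self, uint8_t percent);
    audio_status (*set_mute)(simple_audio *self, int muted);
};

/* Mock device: api must stay the first member so that a simple_audio
 * pointer can be converted back to the containing mock_audio. */
typedef struct {
    simple_audio   api;
    const int16_t *pcm;
    size_t         sample_count;
    uint8_t        volume;
    int            muted;
} mock_audio;

static audio_status mock_load_buffer(simple_audio *self, const int16_t *pcm,
                                     size_t sample_count)
{
    mock_audio *dev = (mock_audio *)self;
    if (pcm == NULL || sample_count == 0) {
        return AUDIO_INVALID_PARAMETER;
    }
    dev->pcm = pcm;
    dev->sample_count = sample_count;
    return AUDIO_OK;
}

static audio_status mock_play(simple_audio *self, audio_token *token)
{
    mock_audio *dev = (mock_audio *)self;
    if (dev->pcm == NULL) {
        return AUDIO_NOT_READY;      /* nothing loaded yet */
    }
    /* A real driver would queue the buffer and complete the token later;
     * the mock completes it on the spot. */
    if (token != NULL) {
        token->status = AUDIO_OK;
        if (token->on_complete != NULL) {
            token->on_complete(token->context);
        }
    }
    return AUDIO_OK;
}

static audio_status mock_set_volume(simple_audio *self, uint8_t percent)
{
    if (percent > 100) {
        return AUDIO_INVALID_PARAMETER;
    }
    ((mock_audio *)self)->volume = percent;
    return AUDIO_OK;
}

static audio_status mock_set_mute(simple_audio *self, int muted)
{
    ((mock_audio *)self)->muted = (muted != 0);
    return AUDIO_OK;
}

static void mock_init(mock_audio *dev)
{
    assert(dev != NULL);
    dev->api.load_buffer = mock_load_buffer;
    dev->api.play        = mock_play;
    dev->api.set_volume  = mock_set_volume;
    dev->api.set_mute    = mock_set_mute;
    dev->pcm = NULL;
    dev->sample_count = 0;
    dev->volume = 100;
    dev->muted = 0;
}

/* Example completion callback: counts how many playbacks finished. */
static void count_completion(void *context)
{
    ++*(int *)context;
}
```

Keeping load and play separate means that settings such as volume or mute can be applied between the two calls, and further knobs can be added later without changing either signature.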

Also if you need to pick apart the Wave PCM 16-bit file to feed it into the
audio hardware, that probably means we should have a library that does that
work, so other audio drivers can share that code. Also having a library
makes it easier to write a unit test. We need to be security conscious as we
need to treat the audio file as attacker-controlled data.
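
As a rough idea of what such a library routine has to check, here is a minimal sketch in plain C that locates 16-bit PCM data inside a RIFF/WAVE buffer. It assumes the common layout (a "fmt " chunk before the "data" chunk, little-endian fields) and rejects anything else rather than guessing. `wav_pcm16` and `wav_parse_pcm16` are hypothetical names, and the sketch is not a complete or hardened parser.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint16_t       channels;
    uint32_t       sample_rate;
    const uint8_t *data;        /* points into the caller's buffer */
    uint32_t       data_bytes;
} wav_pcm16;

/* Little-endian reads that never assume alignment. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 and fills *out on success, -1 on malformed or unsupported
 * input. Every length taken from the file is checked against the bytes
 * actually present before it is used. */
static int wav_parse_pcm16(const uint8_t *buf, size_t len, wav_pcm16 *out)
{
    size_t pos = 12;             /* first chunk follows the RIFF header */
    int    have_fmt = 0;

    if (buf == NULL || out == NULL || len < 12) {
        return -1;
    }
    if (memcmp(buf, "RIFF", 4) != 0 || memcmp(buf + 8, "WAVE", 4) != 0) {
        return -1;
    }

    while (len - pos >= 8) {
        uint32_t       size      = rd32(buf + pos + 4);
        const uint8_t *body      = buf + pos + 8;
        size_t         remaining = len - pos - 8;
        size_t         advance;

        if (size > remaining) {
            return -1;           /* chunk claims more bytes than exist */
        }
        if (memcmp(buf + pos, "fmt ", 4) == 0) {
            if (size < 16) {
                return -1;
            }
            /* Format tag 1 is integer PCM; only 16-bit is accepted. */
            if (rd16(body) != 1 || rd16(body + 14) != 16) {
                return -1;
            }
            out->channels    = rd16(body + 2);
            out->sample_rate = rd32(body + 4);
            if (out->channels == 0 || out->sample_rate == 0) {
                return -1;
            }
            if (rd16(body + 12) != out->channels * 2) {
                return -1;       /* block align must match the format */
            }
            have_fmt = 1;
        } else if (memcmp(buf + pos, "data", 4) == 0) {
            if (!have_fmt || size % (out->channels * 2u) != 0) {
                return -1;
            }
            out->data       = body;
            out->data_bytes = size;
            return 0;
        }
        /* Chunks are padded to an even length; the pad must exist too. */
        advance = (size_t)size + (size & 1u);
        if (advance > remaining) {
            return -1;
        }
        pos += 8 + advance;
        assert(pos <= len);      /* invariant: never step past the end */
    }
    return -1;
}
```

Because the routine only reads from a byte buffer, it can be unit tested on a host build with hand-crafted or fuzzed inputs and without any audio hardware.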

Thanks,

Andrew Fish

Michael









-- 
Signed,
Ethin D. Probst












-- 
Signed,
Ethin D. Probst






--Apple-Mail=_7356BD0C-473A-494E-9EE4-A226AB7F9F79--