From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mout01.posteo.de (mout01.posteo.de [185.67.36.65]) by mx.groups.io with SMTP id smtpd.web08.10258.1618585425207882247 for ; Fri, 16 Apr 2021 08:03:46 -0700 Authentication-Results: mx.groups.io; dkim=fail reason="body hash did not verify" header.i=@posteo.de header.s=2017 header.b=Y7EcQuyF; spf=pass (domain: posteo.de, ip: 185.67.36.65, mailfrom: mhaeuser@posteo.de) Received: from submission (posteo.de [89.146.220.130]) by mout01.posteo.de (Postfix) with ESMTPS id 6E40624002A for ; Fri, 16 Apr 2021 17:03:42 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=posteo.de; s=2017; t=1618585422; bh=Ue+VUgMU2rwwylZIcvgTAEDPFOqVHn3vZlU+rAZyalw=; h=Subject:To:Cc:From:Date:From; b=Y7EcQuyF5jQl/7v49W3KDCg6G7xtAf/0JpquzUo/ky6P0L7huH440cKUR9NdtIHaA HOjqWoTUJUWqz6ouwtatltBltk5SKmFzMyx7gWKAwfS7qosBF+k5IFglTxOZHdHEST 6nzg0bxzMh1J931RrPpxlkVMNKPEfUf7tQ1nWIU5hFgK3dMiF8Fi3gk4GgsARqjlWW 6nVjLyf71Ek4mKuYovgyBLQa9IeYEMndw8u1GuA/UsDtFsz6vAcEKxFJr8K8nJ2kDd A0bASRWKklMoC3HZI6J2k1vJqVSCdhxALPBnc9OMyzNcPIfSSrFPqMRlNXRkS7TSGo 8MKT3IFlRYaxg== Received: from customer (localhost [127.0.0.1]) by submission (posteo.de) with ESMTPSA id 4FMKGh2gJyz9rxL; Fri, 16 Apr 2021 17:03:40 +0200 (CEST) Subject: Re: [edk2-devel] VirtIO Sound Driver (GSoC 2021) To: devel@edk2.groups.io, afish@apple.com Cc: Ethin Probst , Michael Brown , Mike Kinney , Leif Lindholm , Laszlo Ersek , "Desimone, Nathaniel L" , Rafael Rodrigues Machado , Gerd Hoffmann References: <4AEC1784-99AF-47EF-B7DD-77F91EA3D7E9@apple.com> <309cc5ca-2ecd-79dd-b183-eec0572ea982@ipxe.org> <406e5bdb-f6aa-21ce-c96a-b16fb07c181d@posteo.de> <2296BE7E-ACEE-4286-9A5C-408B2D1ADC2E@apple.com> From: =?UTF-8?B?TWFydmluIEjDpHVzZXI=?= Message-ID: <22f0c186-b762-7790-92f6-e8347b73e26b@posteo.de> Date: Fri, 16 Apr 2021 15:03:39 +0000 MIME-Version: 1.0 In-Reply-To: <2296BE7E-ACEE-4286-9A5C-408B2D1ADC2E@apple.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: quoted-printable On 16.04.21 16:34, Andrew Fish via groups.io wrote: > > >> On Apr 16, 2021, at 6:22 AM, Marvin H=C3=A4user > > wrote: >> >> Good day, >> >> Sorry for the nitpicking. >> >> - Protocols always need a "Revision" field as first member. This is=20 >> used to be able to expand its capabilities in later revisions without= =20 >> introducing a new, distinct protocol. >> - Consider the name EFI_SIMPLE_AUDIO_OUTPUT(!)_PROTOCOL, to not cause= =20 >> confusion if input is ever added. Input in my opinion should be a=20 >> separate protocol as there is no reason why they would necessarily be= =20 >> coupled topology-wise (think of an USB microphone, it will never have= =20 >> any sort of output). >> - To make code safety a bit easier, try to use "CONST" for "IN"=20 >> (non-OUT) pointers, so that CONST can be propagated where possible. >> - Please do *not* make the events caller-owned. We had it multiple=20 >> times already on production firmware that events are left dangling=20 >> and may be polled/signaled after ExitBS(). The caller should be able=20 >> to decide on some policy maybe (i.e. abort or block on ExitBS() until= =20 >> the playback finished), as cut-off audio may be awkward; but the=20 >> callee definitely should implement "event safety" itself. Maybe avoid= =20 >> exposing events directly at all and provide nice abstractions the=20 >> caller cannot misuse. >> - I don't think audio should be required at all, the required subset=20 >> should firstly consider minimalism and security. Accessibility will=20 >> not be of concern for some IoT device, the audio code would simply=20 >> eat space, and introduce a larger surface for bugs. >> > > Marvin, > > Generally how we work this in the UEFI Specification is we make it=20 > optional via the following wording: =E2=80=9CIf a platform includes the= =20 > ability to play audio in EFI then the EFI_SIMPLE_AUDIO_OUTPUT_PROTOCOL= =20 > must be implemented. Yes, this sounds good. Just saying it should not be strictly mandatory=20 on all platforms. :) Thanks! Best regards, Marvin > > Basically this requirement will get added to UEFI Specification 2.6.2=20 > Platform-Specific Elements. > > Thanks, > > Andrew Fish > >> Best regards, >> Marvin >> >> On 16.04.21 01:42, Ethin Probst wrote: >>> Hi Andrew, >>> >>> What would that protocol interface look like if we utilized your idea? >>> With mine (though I need to add channel mapping as well), your >>> workflow for playing a stereo sound from left to right would probably >>> be something like this: >>> 1) Encode the sound using a standard tool into a Wave PCM 16. >>> 2) Place the Wave file in the Firmware Volume using a given UUID as >>> the name. As simple as editing the platform FDF file. >>> 3) Write some BDS code >>> a) Lookup Wave file by UUID and read it into memory. >>> b) Decode the audio file (audio devices will not do this decoding >>> for you, you have to do that yourself). >>> c) Call EFI_AUDIO_PROTOCOL.LoadBuffer(), passing in the sample rate >>> of your audio, EFI_AUDIO_PROTOCOL_SAMPLE_FORMAT_S16 for signed 16-bit >>> PCM audio, the channel mapping, the number of samples, and the samples >>> themselves. >>> =C2=A0=C2=A0d) call=20 >>> EFI_BOOT_SERVICES.CreateEvent()/EFI_BOOT_SERVICES.CreateEventEx() >>> for a playback event to signal. >>> =C2=A0=C2=A0e) call EFI_AUDIO_PROTOCOL.StartPlayback(), passing in the= event you >>> just created. >>> The reason that LoadBuffer() takes so many parameters is because the >>> device does not know the audio that your passing in. If I'm given an >>> array of 16-bit audio samples, its impossible to know the parameters >>> (sample rate, sample format, channel mapping, etc.) from that alone. >>> Using your idea, though, my protocol could be greatly simplified. >>> Forcing a particular channel mapping, sample rate and sample format on >>> everyone would complicate application code. From an application point >>> of view, one would, with that type of protocol, need to do the >>> following: >>> 1) Load an audio file in any audio file format from any storage=20 >>> mechanism. >>> 2) Decode the audio file format to extract the samples and audio=20 >>> metadata. >>> 3) Resample the (now decoded) audio samples and convert (quantize) the >>> audio samples into signed 16-bit PCM audio. >>> 4) forward the samples onto the EFI audio protocol. >>> There is another option. (I'm happy we're discussing this now -- we >>> can hammer out all the details now which will make a lot of things >>> easier.) Since I'll most likely end up splitting the device-specific >>> interfaces to different audio protocols, we could make a simple audio >>> protocol that makes various assumptions about the audio samples your >>> giving it (e.g.: sample rate, format, ...). This would just allow >>> audio output and input in signed 16-bit PCM audio, as you've >>> suggested, and would be a simple and easy to use interface. Something >>> like: >>> typedef struct EFI_SIMPLE_AUDIO_PROTOCOL { >>> =C2=A0=C2=A0EFI_SIMPLE_AUDIO_PROTOCOL_RESET Reset; >>> =C2=A0=C2=A0EFI_SIMPLE_AUDIO_PROTOCOL_START Start; >>> =C2=A0=C2=A0EFI_SIMPLE_AUDIO_PROTOCOL_STOP Stop; >>> } EFI_SIMPLE_AUDIO_PROTOCOL; >>> This way, users and driver developers have a simple audio protocol >>> they can implement if they like. It would assume signed 16-bit PCM >>> audio and mono channel mappings at 44100 Hz. Then, we can have an >>> advanced protocol for each device type (HDA, USB, VirtIO, ...) that >>> exposes all the knobs for sample formats, sample rates, that kind of >>> thing. Obviously, like the majority (if not all) UEFI protocols, these >>> advanced protocols would be optional. I think, however, that the >>> simple audio protocol should be a required protocol in all UEFI >>> implementations. But that might not be possible. So would this simpler >>> interface work as a starting point? >>> >>> On 4/15/21, Andrew Fish >=20 >>> wrote: >>>> >>>>> On Apr 15, 2021, at 1:11 PM, Ethin Probst >>>> > >>>>> wrote: >>>>> >>>>>> Is there any necessity for audio input and output to be implemented >>>>>> within the same protocol? =C2=A0Unlike a network device (which is >>>>>> intrinsically bidirectional), it seems natural to conceptually=20 >>>>>> separate >>>>>> audio input from audio output. >>>>> Nope, there isn't a necessity to make them in one, they can be >>>>> separated into two. >>>>> >>>>>> The code controlling volume/mute may not have any access to the=20 >>>>>> sample >>>>>> buffer. =C2=A0The most natural implementation would seem to allow f= or a >>>>>> platform to notice volume up/down keypresses and use those to=20 >>>>>> control the >>>>>> overall system volume, without any knowledge of which samples (if= =20 >>>>>> any) >>>>>> are currently being played by other code in the system. >>>>> Your assuming that the audio device your implementing the >>>>> volume/muting has volume control and muting functionality within >>>>> itself, then. >>>> Not really. We are assuming that audio hardware has a better=20 >>>> understanding >>>> of how that system implements volume than some generic EFI Code=20 >>>> that is by >>>> definition platform agnostic. >>>> >>>>> This may not be the case, and so we'd need to >>>>> effectively simulate it within the driver, which isn't too hard to d= o. >>>>> As an example, the VirtIO driver does not have a request type for >>>>> muting or for volume control (this would, most likely, be within the >>>>> VIRTIO_SND_R_PCM_SET_PARAMS request, see sec. 5.14.6.4.3). Therefore= , >>>>> either the driver would have to simulate the request or return >>>>> EFI_UNSUPPORTED this instance. >>>>> >>>> So this is an example of above since the audio hardware knows it is= =20 >>>> routing >>>> the sound output into another subsystem, and that subsystem=20 >>>> controls the >>>> volume. So the VirtIo Sound Driver know best how to bstract=20 >>>> volume/mute for >>>> this platform. >>>> >>>>>> Consider also the point of view of the developer implementing a=20 >>>>>> driver >>>>>> for some other piece of audio hardware that happens not to support >>>>>> precisely the same sample rates etc as VirtIO. =C2=A0It would be ex= tremely >>>>>> ugly to force all future hardware to pretend to have the same >>>>>> capabilities as VirtIO just because the API was initially=20 >>>>>> designed with >>>>>> VirtIO in mind. >>>>> Precisely, but the brilliance of VirtIO >>>> The brilliance of VirtIO is that it just needs to implement a=20 >>>> generic device >>>> driver for a given operating system. In most cases these operating=20 >>>> systems >>>> have sounds subsystems that manage sound and want fine granularity of >>>> control on what is going on. So the drivers are implemented to=20 >>>> maximizes >>>> flexibility since the OS has lots of generic code that deals with=20 >>>> sound, and >>>> even user configurable knobs to control audio. In our case that=20 >>>> extra layer >>>> does not exist in EFI and the end user code just want to tell the=20 >>>> driver do >>>> some simple things. >>>> >>>> Maybe it is easier to think about with an example. Lets say I want=20 >>>> to play a >>>> cool sound on every boot. What would be the workflow to make the=20 >>>> happen. >>>> 1) Encode the sound using a standard tool into a Wave PCM 16. >>>> 2) Place the Wave file in the Firmware Volume using a given UUID as t= he >>>> name. As simple as editing the platform FDF file. >>>> 3) Write some BDS code >>>> =C2=A0=C2=A0a) Lookup Wave file by UUID and read it into memory. >>>> =C2=A0=C2=A0b) Point the EFI Sound Protocol at the buffer with the Wa= ve file >>>> =C2=A0=C2=A0c) Tell the EFI Sound Protocol to play the sound. >>>> >>>> If you start adding in a lot of perimeters that work flow starts=20 >>>> getting >>>> really complicated really quickly. >>>> >>>>> is that the sample rate, >>>>> sample format, &c., do not have to all be supported by a VirtIO >>>>> device. Notice, also, how in my protocol proposal I noted that the >>>>> sample rates, at least, were "recommended," not "required." Should a >>>>> device not happen to support a sample rate or sample format, all it >>>>> needs to do is return EFI_INVALID_PARAMETER. Section 5.14.6.2.1 >>>>> (VIRTIO_SND_R_JACK_GET_CONFIG) describes how a jack tells you what >>>>> sample rates it supports, channel mappings, &c. >>>>> >>>>> I do understand how just using a predefined sample rate and sample >>>>> format might be a good idea, and if that's the best way, then that's >>>>> what we'll do. The protocol can always be revised at a later time if >>>>> necessary. I apologize if my stance seems obstinate. >>>>> >>>> I think if we add the version into the protocol and make sure we have= a >>>> separate load and play operation we could add a member to set the ext= ra >>>> perimeters if needed. There might also be some platform specific=20 >>>> generic >>>> tunables that might be interesting for a future member function. >>>> >>>> Thanks, >>>> >>>> Andrew Fish >>>> >>>>> Also, thank you, Laszlo, for your advice -- I hadn't considered that= a >>>>> network driver would be another good way of figuring out how async >>>>> works in UEFI. >>>>> >>>>> On 4/15/21, Andrew Fish >= =20 >>>>> wrote: >>>>>> >>>>>>> On Apr 15, 2021, at 5:07 AM, Michael Brown >>>>>> > wrote: >>>>>>> >>>>>>> On 15/04/2021 06:28, Ethin Probst wrote: >>>>>>>> - I hoped to add recording in case we in future want to add >>>>>>>> accessibility aids like speech recognition (that was one of the= =20 >>>>>>>> todo >>>>>>>> tasks on the EDK2 tasks list) >>>>>>> Is there any necessity for audio input and output to be implemente= d >>>>>>> within >>>>>>> the same protocol? =C2=A0Unlike a network device (which is intrins= ically >>>>>>> bidirectional), it seems natural to conceptually separate audio=20 >>>>>>> input >>>>>>> from >>>>>>> audio output. >>>>>>> >>>>>>>> - Muting and volume control could easily be added by just replaci= ng >>>>>>>> the sample buffer with silence and by multiplying all the=20 >>>>>>>> samples by a >>>>>>>> percentage. >>>>>>> The code controlling volume/mute may not have any access to the=20 >>>>>>> sample >>>>>>> buffer. =C2=A0The most natural implementation would seem to allow = for a >>>>>>> platform to notice volume up/down keypresses and use those to=20 >>>>>>> control >>>>>>> the >>>>>>> overall system volume, without any knowledge of which samples=20 >>>>>>> (if any) >>>>>>> are >>>>>>> currently being played by other code in the system. >>>>>>> >>>>>> I=E2=80=99ve also thought of adding NVRAM variable that would let s= etup, UEFI >>>>>> Shell, >>>>>> or even the OS set the current volume, and Mute. This how it would = be >>>>>> consumed concept is why I proposed mute and volume being separate= =20 >>>>>> APIs. >>>>>> The >>>>>> volume up/down API in addition to fixed percentage might be=20 >>>>>> overkill, but >>>>>> it >>>>>> does allow a non liner mapping to the volume up/down keys. You=20 >>>>>> would be >>>>>> surprised how picky audiophiles can be and it seems they like to fi= le >>>>>> Bugzillas. >>>>>> >>>>>>>> - Finally, the reason I used enumerations for specifying paramete= rs >>>>>>>> like sample rate and stuff was that I was looking at this protoco= l >>>>>>>> from a general UEFI applications point of view. VirtIO supports= =20 >>>>>>>> all of >>>>>>>> the sample configurations listed in my gist, and it seems=20 >>>>>>>> reasonable >>>>>>>> to allow the application to control those parameters instead of >>>>>>>> forcing a particular parameter configuration onto the developer. >>>>>>> Consider also the point of view of the developer implementing a=20 >>>>>>> driver >>>>>>> for >>>>>>> some other piece of audio hardware that happens not to support >>>>>>> precisely >>>>>>> the same sample rates etc as VirtIO. =C2=A0It would be extremely u= gly to >>>>>>> force >>>>>>> all future hardware to pretend to have the same capabilities as=20 >>>>>>> VirtIO >>>>>>> just because the API was initially designed with VirtIO in mind. >>>>>>> >>>>>>> As a developer on the other side of the API, writing code to=20 >>>>>>> play sound >>>>>>> files on an arbitrary unknown platform, I would prefer to simply >>>>>>> consume >>>>>>> as simple as possible an abstraction of an audio output protocol a= nd >>>>>>> not >>>>>>> have to care about what hardware is actually implementing it. >>>>>>> >>>>>> It may make sense to have an API to load the buffer/stream and=20 >>>>>> other APIs >>>>>> to >>>>>> play or pause. This could allow an optional API to configure how th= e >>>>>> stream >>>>>> is played back. If we add a version to the Protocol that would at= =20 >>>>>> least >>>>>> future proof us. >>>>>> >>>>>> We did get feedback that it is very common to speed up the auto=20 >>>>>> playback >>>>>> rates for accessibility. I=E2=80=99m not sure if that is practical = with a=20 >>>>>> simple >>>>>> PCM >>>>>> 16 wave file with the firmware audio implementation. I guess that i= s >>>>>> something we could investigate. >>>>>> >>>>>> In terms of maybe adding text to speech there is an open source=20 >>>>>> project >>>>>> that >>>>>> conceptually we could port to EFI. It would likely be a binary that >>>>>> would >>>>>> have to live on the EFI System Partition due to size. I was thinkin= g >>>>>> that >>>>>> words per minute could be part of that API and it would produce a= =20 >>>>>> PCM 16 >>>>>> wave file that the audio protocol we are discussing could play. >>>>>> >>>>>>> Both of these argue in favour of defining a very simple API that >>>>>>> expresses >>>>>>> only a common baseline capability that is plausibly=20 >>>>>>> implementable for >>>>>>> every piece of audio hardware ever made. >>>>>>> >>>>>>> Coupled with the relatively minimalistic requirements for boot-tim= e >>>>>>> audio, >>>>>>> I'd probably suggest supporting only a single format for audio dat= a, >>>>>>> with >>>>>>> a fixed sample rate (and possibly only mono output). >>>>>>> >>>>>> In my world the folks that work for Jony asked for a stereo boot=20 >>>>>> bong to >>>>>> transition from left to right :). This is not the CODEC you are=20 >>>>>> looking >>>>>> for >>>>>> was our response=E2=80=A6. I also did not mention that some languag= es are=20 >>>>>> right >>>>>> to >>>>>> left, as the only thing worse than one complex thing is two complex >>>>>> things >>>>>> to implement. >>>>>> >>>>>>> As always: perfection is achieved, not when there is nothing more = to >>>>>>> add, >>>>>>> but when there is nothing left to take away. =C2=A0:) >>>>>>> >>>>>> "Simplicity is the ultimate sophistication=E2=80=9D >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Andrew Fish >>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Michael >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>> >>>>> -- >>>>> Signed, >>>>> Ethin D. Probst >>>> >>> >> >> >> > >=20