From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from rn-mailsvcp-ppex-lapp34.apple.com (rn-mailsvcp-ppex-lapp34.apple.com [17.179.253.43]) by mx.groups.io with SMTP id smtpd.web11.9855.1618583672703872840 for ; Fri, 16 Apr 2021 07:34:32 -0700 Authentication-Results: mx.groups.io; dkim=pass header.i=@apple.com header.s=20180706 header.b=sPhZiMFf; spf=pass (domain: apple.com, ip: 17.179.253.43, mailfrom: afish@apple.com) Received: from pps.filterd (rn-mailsvcp-ppex-lapp34.rno.apple.com [127.0.0.1]) by rn-mailsvcp-ppex-lapp34.rno.apple.com (8.16.1.2/8.16.1.2) with SMTP id 13GEXhVn012852; Fri, 16 Apr 2021 07:34:32 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=apple.com; h=from : message-id : content-type : mime-version : subject : date : in-reply-to : cc : to : references; s=20180706; bh=V10UptC5nK+BsVySUnmhBm0an5VHoRTmOZC5cAoXSAw=; b=sPhZiMFfx18UmRo/jKlu4oYdla8C4V69dyGlSMDXQwnR5C7XW5u/jQLyX79DItiZ7D3Y qPTAt/T7Mw4SoqQW11Py1Xm9pOJwBfBk7WJ0PH+vkdDFYzVo1g4QRzaxaZhN/yk3URJU Jc7/O53cpPnM9YbGRR+qBWkttXLN9qn3LKPZQfLXsDZOpbvbKL4byWGhGBT4WIYTwRAb SQLnBQdAi60YChLDfOvPPLHCfcvhntn/FyhiZu4gsY1y7mKBiz4g3kDgaxVXU7Bk4/72 jKTj2I6g+/+vA54EvW+1yzYEy9bFJr5mfgXIdyqAJRwBD89M21yB3CqIEY6/R5Jy/mlt jg== Received: from rn-mailsvcp-mta-lapp04.rno.apple.com (rn-mailsvcp-mta-lapp04.rno.apple.com [10.225.203.152]) by rn-mailsvcp-ppex-lapp34.rno.apple.com with ESMTP id 37u7v3ppae-7 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NO); Fri, 16 Apr 2021 07:34:32 -0700 Received: from rn-mailsvcp-mmp-lapp01.rno.apple.com (rn-mailsvcp-mmp-lapp01.rno.apple.com [17.179.253.14]) by rn-mailsvcp-mta-lapp04.rno.apple.com (Oracle Communications Messaging Server 8.1.0.7.20201203 64bit (built Dec 3 2020)) with ESMTPS id <0QRN00A00V5IG1A0@rn-mailsvcp-mta-lapp04.rno.apple.com>; Fri, 16 Apr 2021 07:34:30 -0700 (PDT) Received: from process_milters-daemon.rn-mailsvcp-mmp-lapp01.rno.apple.com by rn-mailsvcp-mmp-lapp01.rno.apple.com (Oracle Communications Messaging Server 8.1.0.7.20201203 64bit (built Dec 3 2020)) id <0QRN00300UTKDZ00@rn-mailsvcp-mmp-lapp01.rno.apple.com>; Fri, 16 Apr 2021 07:34:30 -0700 (PDT) X-Va-A: X-Va-T-CD: 9ad46be6e1c3c1a24e92ea4dad46d58d X-Va-E-CD: 4730c80ee67030d4f2c83e40b4ab0357 X-Va-R-CD: 6f0325faf294bd23a6d751c620be9d51 X-Va-CD: 0 X-Va-ID: 1d7b60dd-fdbf-43fe-bf6d-3c984d38c7ad X-V-A: X-V-T-CD: 9ad46be6e1c3c1a24e92ea4dad46d58d X-V-E-CD: 4730c80ee67030d4f2c83e40b4ab0357 X-V-R-CD: 6f0325faf294bd23a6d751c620be9d51 X-V-CD: 0 X-V-ID: d1eb756c-42fd-428a-b94f-2143d525b461 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.391,18.0.761 definitions=2021-04-16_07:2021-04-16,2021-04-16 signatures=0 Received: from [17.235.19.21] (unknown [17.235.19.21]) by rn-mailsvcp-mmp-lapp01.rno.apple.com (Oracle Communications Messaging Server 8.1.0.7.20201203 64bit (built Dec 3 2020)) with ESMTPSA id <0QRN00VKMV5FAA00@rn-mailsvcp-mmp-lapp01.rno.apple.com>; Fri, 16 Apr 2021 07:34:29 -0700 (PDT) From: "Andrew Fish" Message-id: <2296BE7E-ACEE-4286-9A5C-408B2D1ADC2E@apple.com> MIME-version: 1.0 (Mac OS X Mail 14.0 \(3654.20.0.2.1\)) Subject: Re: [edk2-devel] VirtIO Sound Driver (GSoC 2021) Date: Fri, 16 Apr 2021 07:34:27 -0700 In-reply-to: <406e5bdb-f6aa-21ce-c96a-b16fb07c181d@posteo.de> Cc: Ethin Probst , Michael Brown , Mike Kinney , Leif Lindholm , Laszlo Ersek , "Desimone, Nathaniel L" , Rafael Rodrigues Machado , Gerd Hoffmann To: edk2-devel-groups-io , =?utf-8?Q?Marvin_H=C3=A4user?= References: <16758FB6436B1195.32393@groups.io> <4AEC1784-99AF-47EF-B7DD-77F91EA3D7E9@apple.com> <309cc5ca-2ecd-79dd-b183-eec0572ea982@ipxe.org> <406e5bdb-f6aa-21ce-c96a-b16fb07c181d@posteo.de> X-Mailer: Apple Mail (2.3654.20.0.2.1) X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.391,18.0.761 definitions=2021-04-16_07:2021-04-16,2021-04-16 signatures=0 Content-type: multipart/alternative; boundary="Apple-Mail=_C411943F-CDA1-45E8-BB2A-B3625A4AA9FB" --Apple-Mail=_C411943F-CDA1-45E8-BB2A-B3625A4AA9FB Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8 > On Apr 16, 2021, at 6:22 AM, Marvin H=C3=A4user wro= te: >=20 > Good day, >=20 > Sorry for the nitpicking. >=20 > - Protocols always need a "Revision" field as first member. This is used= to be able to expand its capabilities in later revisions without introduci= ng a new, distinct protocol. > - Consider the name EFI_SIMPLE_AUDIO_OUTPUT(!)_PROTOCOL, to not cause co= nfusion if input is ever added. Input in my opinion should be a separate pr= otocol as there is no reason why they would necessarily be coupled topology= -wise (think of an USB microphone, it will never have any sort of output). > - To make code safety a bit easier, try to use "CONST" for "IN" (non-OUT= ) pointers, so that CONST can be propagated where possible. > - Please do *not* make the events caller-owned. We had it multiple times= already on production firmware that events are left dangling and may be po= lled/signaled after ExitBS(). The caller should be able to decide on some p= olicy maybe (i.e. abort or block on ExitBS() until the playback finished), = as cut-off audio may be awkward; but the callee definitely should implement= "event safety" itself. Maybe avoid exposing events directly at all and pro= vide nice abstractions the caller cannot misuse. > - I don't think audio should be required at all, the required subset sho= uld firstly consider minimalism and security. Accessibility will not be of = concern for some IoT device, the audio code would simply eat space, and int= roduce a larger surface for bugs. >=20 Marvin, Generally how we work this in the UEFI Specification is we make it optiona= l via the following wording: =E2=80=9CIf a platform includes the ability to= play audio in EFI then the EFI_SIMPLE_AUDIO_OUTPUT_PROTOCOL must be implem= ented.=20 Basically this requirement will get added to UEFI Specification 2.6.2 Plat= form-Specific Elements. Thanks, Andrew Fish > Best regards, > Marvin >=20 > On 16.04.21 01:42, Ethin Probst wrote: >> Hi Andrew, >>=20 >> What would that protocol interface look like if we utilized your idea? >> With mine (though I need to add channel mapping as well), your >> workflow for playing a stereo sound from left to right would probably >> be something like this: >> 1) Encode the sound using a standard tool into a Wave PCM 16. >> 2) Place the Wave file in the Firmware Volume using a given UUID as >> the name. As simple as editing the platform FDF file. >> 3) Write some BDS code >> a) Lookup Wave file by UUID and read it into memory. >> b) Decode the audio file (audio devices will not do this decoding >> for you, you have to do that yourself). >> c) Call EFI_AUDIO_PROTOCOL.LoadBuffer(), passing in the sample rate >> of your audio, EFI_AUDIO_PROTOCOL_SAMPLE_FORMAT_S16 for signed 16-bit >> PCM audio, the channel mapping, the number of samples, and the samples >> themselves. >> d) call EFI_BOOT_SERVICES.CreateEvent()/EFI_BOOT_SERVICES.CreateEvent= Ex() >> for a playback event to signal. >> e) call EFI_AUDIO_PROTOCOL.StartPlayback(), passing in the event you >> just created. >> The reason that LoadBuffer() takes so many parameters is because the >> device does not know the audio that your passing in. If I'm given an >> array of 16-bit audio samples, its impossible to know the parameters >> (sample rate, sample format, channel mapping, etc.) from that alone. >> Using your idea, though, my protocol could be greatly simplified. >> Forcing a particular channel mapping, sample rate and sample format on >> everyone would complicate application code. From an application point >> of view, one would, with that type of protocol, need to do the >> following: >> 1) Load an audio file in any audio file format from any storage mechani= sm. >> 2) Decode the audio file format to extract the samples and audio metada= ta. >> 3) Resample the (now decoded) audio samples and convert (quantize) the >> audio samples into signed 16-bit PCM audio. >> 4) forward the samples onto the EFI audio protocol. >> There is another option. (I'm happy we're discussing this now -- we >> can hammer out all the details now which will make a lot of things >> easier.) Since I'll most likely end up splitting the device-specific >> interfaces to different audio protocols, we could make a simple audio >> protocol that makes various assumptions about the audio samples your >> giving it (e.g.: sample rate, format, ...). This would just allow >> audio output and input in signed 16-bit PCM audio, as you've >> suggested, and would be a simple and easy to use interface. Something >> like: >> typedef struct EFI_SIMPLE_AUDIO_PROTOCOL { >> EFI_SIMPLE_AUDIO_PROTOCOL_RESET Reset; >> EFI_SIMPLE_AUDIO_PROTOCOL_START Start; >> EFI_SIMPLE_AUDIO_PROTOCOL_STOP Stop; >> } EFI_SIMPLE_AUDIO_PROTOCOL; >> This way, users and driver developers have a simple audio protocol >> they can implement if they like. It would assume signed 16-bit PCM >> audio and mono channel mappings at 44100 Hz. Then, we can have an >> advanced protocol for each device type (HDA, USB, VirtIO, ...) that >> exposes all the knobs for sample formats, sample rates, that kind of >> thing. Obviously, like the majority (if not all) UEFI protocols, these >> advanced protocols would be optional. I think, however, that the >> simple audio protocol should be a required protocol in all UEFI >> implementations. But that might not be possible. So would this simpler >> interface work as a starting point? >>=20 >> On 4/15/21, Andrew Fish wrote: >>>=20 >>>> On Apr 15, 2021, at 1:11 PM, Ethin Probst >>>> wrote: >>>>=20 >>>>> Is there any necessity for audio input and output to be implemented >>>>> within the same protocol? Unlike a network device (which is >>>>> intrinsically bidirectional), it seems natural to conceptually separ= ate >>>>> audio input from audio output. >>>> Nope, there isn't a necessity to make them in one, they can be >>>> separated into two. >>>>=20 >>>>> The code controlling volume/mute may not have any access to the samp= le >>>>> buffer. The most natural implementation would seem to allow for a >>>>> platform to notice volume up/down keypresses and use those to contro= l the >>>>> overall system volume, without any knowledge of which samples (if an= y) >>>>> are currently being played by other code in the system. >>>> Your assuming that the audio device your implementing the >>>> volume/muting has volume control and muting functionality within >>>> itself, then. >>> Not really. We are assuming that audio hardware has a better understan= ding >>> of how that system implements volume than some generic EFI Code that i= s by >>> definition platform agnostic. >>>=20 >>>> This may not be the case, and so we'd need to >>>> effectively simulate it within the driver, which isn't too hard to do= . >>>> As an example, the VirtIO driver does not have a request type for >>>> muting or for volume control (this would, most likely, be within the >>>> VIRTIO_SND_R_PCM_SET_PARAMS request, see sec. 5.14.6.4.3). Therefore, >>>> either the driver would have to simulate the request or return >>>> EFI_UNSUPPORTED this instance. >>>>=20 >>> So this is an example of above since the audio hardware knows it is ro= uting >>> the sound output into another subsystem, and that subsystem controls t= he >>> volume. So the VirtIo Sound Driver know best how to bstract volume/mut= e for >>> this platform. >>>=20 >>>>> Consider also the point of view of the developer implementing a driv= er >>>>> for some other piece of audio hardware that happens not to support >>>>> precisely the same sample rates etc as VirtIO. It would be extremel= y >>>>> ugly to force all future hardware to pretend to have the same >>>>> capabilities as VirtIO just because the API was initially designed w= ith >>>>> VirtIO in mind. >>>> Precisely, but the brilliance of VirtIO >>> The brilliance of VirtIO is that it just needs to implement a generic = device >>> driver for a given operating system. In most cases these operating sys= tems >>> have sounds subsystems that manage sound and want fine granularity of >>> control on what is going on. So the drivers are implemented to maximiz= es >>> flexibility since the OS has lots of generic code that deals with soun= d, and >>> even user configurable knobs to control audio. In our case that extra = layer >>> does not exist in EFI and the end user code just want to tell the driv= er do >>> some simple things. >>>=20 >>> Maybe it is easier to think about with an example. Lets say I want to = play a >>> cool sound on every boot. What would be the workflow to make the happe= n. >>> 1) Encode the sound using a standard tool into a Wave PCM 16. >>> 2) Place the Wave file in the Firmware Volume using a given UUID as th= e >>> name. As simple as editing the platform FDF file. >>> 3) Write some BDS code >>> a) Lookup Wave file by UUID and read it into memory. >>> b) Point the EFI Sound Protocol at the buffer with the Wave file >>> c) Tell the EFI Sound Protocol to play the sound. >>>=20 >>> If you start adding in a lot of perimeters that work flow starts getti= ng >>> really complicated really quickly. >>>=20 >>>> is that the sample rate, >>>> sample format, &c., do not have to all be supported by a VirtIO >>>> device. Notice, also, how in my protocol proposal I noted that the >>>> sample rates, at least, were "recommended," not "required." Should a >>>> device not happen to support a sample rate or sample format, all it >>>> needs to do is return EFI_INVALID_PARAMETER. Section 5.14.6.2.1 >>>> (VIRTIO_SND_R_JACK_GET_CONFIG) describes how a jack tells you what >>>> sample rates it supports, channel mappings, &c. >>>>=20 >>>> I do understand how just using a predefined sample rate and sample >>>> format might be a good idea, and if that's the best way, then that's >>>> what we'll do. The protocol can always be revised at a later time if >>>> necessary. I apologize if my stance seems obstinate. >>>>=20 >>> I think if we add the version into the protocol and make sure we have = a >>> separate load and play operation we could add a member to set the extr= a >>> perimeters if needed. There might also be some platform specific gener= ic >>> tunables that might be interesting for a future member function. >>>=20 >>> Thanks, >>>=20 >>> Andrew Fish >>>=20 >>>> Also, thank you, Laszlo, for your advice -- I hadn't considered that = a >>>> network driver would be another good way of figuring out how async >>>> works in UEFI. >>>>=20 >>>> On 4/15/21, Andrew Fish wrote: >>>>>=20 >>>>>> On Apr 15, 2021, at 5:07 AM, Michael Brown wrote: >>>>>>=20 >>>>>> On 15/04/2021 06:28, Ethin Probst wrote: >>>>>>> - I hoped to add recording in case we in future want to add >>>>>>> accessibility aids like speech recognition (that was one of the to= do >>>>>>> tasks on the EDK2 tasks list) >>>>>> Is there any necessity for audio input and output to be implemented >>>>>> within >>>>>> the same protocol? Unlike a network device (which is intrinsically >>>>>> bidirectional), it seems natural to conceptually separate audio inp= ut >>>>>> from >>>>>> audio output. >>>>>>=20 >>>>>>> - Muting and volume control could easily be added by just replacin= g >>>>>>> the sample buffer with silence and by multiplying all the samples = by a >>>>>>> percentage. >>>>>> The code controlling volume/mute may not have any access to the sam= ple >>>>>> buffer. The most natural implementation would seem to allow for a >>>>>> platform to notice volume up/down keypresses and use those to contr= ol >>>>>> the >>>>>> overall system volume, without any knowledge of which samples (if a= ny) >>>>>> are >>>>>> currently being played by other code in the system. >>>>>>=20 >>>>> I=E2=80=99ve also thought of adding NVRAM variable that would let se= tup, UEFI >>>>> Shell, >>>>> or even the OS set the current volume, and Mute. This how it would b= e >>>>> consumed concept is why I proposed mute and volume being separate AP= Is. >>>>> The >>>>> volume up/down API in addition to fixed percentage might be overkill= , but >>>>> it >>>>> does allow a non liner mapping to the volume up/down keys. You would= be >>>>> surprised how picky audiophiles can be and it seems they like to fil= e >>>>> Bugzillas. >>>>>=20 >>>>>>> - Finally, the reason I used enumerations for specifying parameter= s >>>>>>> like sample rate and stuff was that I was looking at this protocol >>>>>>> from a general UEFI applications point of view. VirtIO supports al= l of >>>>>>> the sample configurations listed in my gist, and it seems reasonab= le >>>>>>> to allow the application to control those parameters instead of >>>>>>> forcing a particular parameter configuration onto the developer. >>>>>> Consider also the point of view of the developer implementing a dri= ver >>>>>> for >>>>>> some other piece of audio hardware that happens not to support >>>>>> precisely >>>>>> the same sample rates etc as VirtIO. It would be extremely ugly to >>>>>> force >>>>>> all future hardware to pretend to have the same capabilities as Vir= tIO >>>>>> just because the API was initially designed with VirtIO in mind. >>>>>>=20 >>>>>> As a developer on the other side of the API, writing code to play s= ound >>>>>> files on an arbitrary unknown platform, I would prefer to simply >>>>>> consume >>>>>> as simple as possible an abstraction of an audio output protocol an= d >>>>>> not >>>>>> have to care about what hardware is actually implementing it. >>>>>>=20 >>>>> It may make sense to have an API to load the buffer/stream and other= APIs >>>>> to >>>>> play or pause. This could allow an optional API to configure how the >>>>> stream >>>>> is played back. If we add a version to the Protocol that would at le= ast >>>>> future proof us. >>>>>=20 >>>>> We did get feedback that it is very common to speed up the auto play= back >>>>> rates for accessibility. I=E2=80=99m not sure if that is practical w= ith a simple >>>>> PCM >>>>> 16 wave file with the firmware audio implementation. I guess that is >>>>> something we could investigate. >>>>>=20 >>>>> In terms of maybe adding text to speech there is an open source proj= ect >>>>> that >>>>> conceptually we could port to EFI. It would likely be a binary that >>>>> would >>>>> have to live on the EFI System Partition due to size. I was thinking >>>>> that >>>>> words per minute could be part of that API and it would produce a PC= M 16 >>>>> wave file that the audio protocol we are discussing could play. >>>>>=20 >>>>>> Both of these argue in favour of defining a very simple API that >>>>>> expresses >>>>>> only a common baseline capability that is plausibly implementable f= or >>>>>> every piece of audio hardware ever made. >>>>>>=20 >>>>>> Coupled with the relatively minimalistic requirements for boot-time >>>>>> audio, >>>>>> I'd probably suggest supporting only a single format for audio data= , >>>>>> with >>>>>> a fixed sample rate (and possibly only mono output). >>>>>>=20 >>>>> In my world the folks that work for Jony asked for a stereo boot bon= g to >>>>> transition from left to right :). This is not the CODEC you are look= ing >>>>> for >>>>> was our response=E2=80=A6. I also did not mention that some language= s are right >>>>> to >>>>> left, as the only thing worse than one complex thing is two complex >>>>> things >>>>> to implement. >>>>>=20 >>>>>> As always: perfection is achieved, not when there is nothing more t= o >>>>>> add, >>>>>> but when there is nothing left to take away. :) >>>>>>=20 >>>>> "Simplicity is the ultimate sophistication=E2=80=9D >>>>>=20 >>>>> Thanks, >>>>>=20 >>>>> Andrew Fish >>>>>=20 >>>>>> Thanks, >>>>>>=20 >>>>>> Michael >>>>>>=20 >>>>>>=20 >>>>>>=20 >>>>>>=20 >>>>>>=20 >>>>>=20 >>>>=20 >>>> -- >>>> Signed, >>>> Ethin D. Probst >>>=20 >>=20 >=20 >=20 >=20 >=20 --Apple-Mail=_C411943F-CDA1-45E8-BB2A-B3625A4AA9FB Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=utf-8

On Apr 16, 2= 021, at 6:22 AM, Marvin H=C3=A4user <mhaeuser@posteo.de> wrote:

Good day,

Sorry for the nitpicking.

- P= rotocols always need a "Revision" field as first member. This is used to be= able to expand its capabilities in later revisions without introducing a n= ew, distinct protocol.
= - Consider the name EFI_SIMPLE_AUDIO_OUTPUT(!)_PROTOCOL, to not cause confu= sion if input is ever added. Input in my opinion should be a separate proto= col as there is no reason why they would necessarily be coupled topology-wi= se (think of an USB microphone, it will never have any sort of output).
- To make code safety a bi= t easier, try to use "CONST" for "IN" (non-OUT) pointers, so that CONST can= be propagated where possible.
- Please do *not* make the events caller-owned. We had it multiple = times already on production firmware that events are left dangling and may = be polled/signaled after ExitBS(). The caller should be able to decide on s= ome policy maybe (i.e. abort or block on ExitBS() until the playback finish= ed), as cut-off audio may be awkward; but the callee definitely should impl= ement "event safety" itself. Maybe avoid exposing events directly at all an= d provide nice abstractions the caller cannot misuse.
- I don't think audio should be required at = all, the required subset should firstly consider minimalism and security. A= ccessibility will not be of concern for some IoT device, the audio code wou= ld simply eat space, and introduce a larger surface for bugs.


Marvin,

Generally how we work this in the UEFI Specification is w= e make it optional via the following wording: =E2=80=9CIf a platform includ= es the ability to play audio in EFI then the EFI_SIMPLE_AUDIO_OUTPUT_PROTOC= OL must be implemented. 

Basically= this requirement will get added to UEFI Specification 2.6.2 Platform-Speci= fic Elements.

Thanks,

Andrew Fish

Best regards,
Marvin

On 16.04.21 01:42, Eth= in Probst wrote:
Hi Andrew,

What would that protocol int= erface look like if we utilized your idea?
With mine (though = I need to add channel mapping as well), your
workflow for pla= ying a stereo sound from left to right would probably
be some= thing like this:
1) Encode the sound using a standard tool in= to a Wave PCM 16.
2) Place the Wave file in the Firmware Volu= me using a given UUID as
the name. As simple as editing the p= latform FDF file.
3) Write some BDS code
 =  a) Lookup Wave file by U= UID and read it into memory.
  b) Decode the audio file (audio devices will not d= o this decoding
for you, you have to do that yourself).
  c) Call E= FI_AUDIO_PROTOCOL.LoadBuffer(), passing in the sample rate
of= your audio, EFI_AUDIO_PROTOCOL_SAMPLE_FORMAT_S16 for signed 16-bit
PCM audio, the channel mapping, the number of samples, and the sampl= es
themselves.
  d) call EFI_BOOT_SER= VICES.CreateEvent()/EFI_BOOT_SERVICES.CreateEventEx()
for a p= layback event to signal.
  e) call EFI_AUDIO_PROTOC= OL.StartPlayback(), passing in the event you
just created.The reason that LoadBuffer() takes so many parameters is becaus= e the
device does not know the audio that your passing in. If= I'm given an
array of 16-bit audio samples, its impossible t= o know the parameters
(sample rate, sample format, channel ma= pping, etc.) from that alone.
Using your idea, though, my pro= tocol could be greatly simplified.
Forcing a particular chann= el mapping, sample rate and sample format on
everyone would c= omplicate application code. From an application point
of view= , one would, with that type of protocol, need to do the
follo= wing:
1) Load an audio file in any audio file format from any= storage mechanism.
2) Decode the audio file format to extrac= t the samples and audio metadata.
3) Resample the (now decode= d) audio samples and convert (quantize) the
audio samples int= o signed 16-bit PCM audio.
4) forward the samples onto the EF= I audio protocol.
There is another option. (I'm happy we're d= iscussing this now -- we
can hammer out all the details now w= hich will make a lot of things
easier.) Since I'll most likel= y end up splitting the device-specific
interfaces to differen= t audio protocols, we could make a simple audio
protocol that= makes various assumptions about the audio samples your
givin= g it (e.g.: sample rate, format, ...). This would just allow
= audio output and input in signed 16-bit PCM audio, as you've
= suggested, and would be a simple and easy to use interface. Something
like:
typedef struct EFI_SIMPLE_AUDIO_PROTOCOL {
  EFI_SIMPLE_AUDIO_PROTOCOL_RESET Reset;
  EFI_SIMPLE_AUDIO_PROTOCOL_START Start;
 &nb= sp;EFI_SIMPLE_AUDIO_PROTOCOL_STOP Stop;
} EFI_SIMPLE_AUDIO_PR= OTOCOL;
This way, users and driver developers have a simple a= udio protocol
they can implement if they like. It would assum= e signed 16-bit PCM
audio and mono channel mappings at 44100 = Hz. Then, we can have an
advanced protocol for each device ty= pe (HDA, USB, VirtIO, ...) that
exposes all the knobs for sam= ple formats, sample rates, that kind of
thing. Obviously, lik= e the majority (if not all) UEFI protocols, these
advanced pr= otocols would be optional. I think, however, that the
simple = audio protocol should be a required protocol in all UEFI
impl= ementations. But that might not be possible. So would this simpler
interface work as a starting point?

On = 4/15/21, Andrew Fish <afis= h@apple.com> wrote:

On Apr 15, 2021, at = 1:11 PM, Ethin Probst <harlydavidsen@gmail.com>
wrote:
<= br class=3D"">
Is there any necessity f= or audio input and output to be implemented
within the same p= rotocol?  Unlike a network device (which is
intrinsicall= y bidirectional), it seems natural to conceptually separate
a= udio input from audio output.
Nope, there isn't = a necessity to make them in one, they can be
separated into t= wo.

The c= ode controlling volume/mute may not have any access to the sample
buffer.  The most natural implementation would seem to allow fo= r a
platform to notice volume up/down keypresses and use thos= e to control the
overall system volume, without any knowledge= of which samples (if any)
are currently being played by othe= r code in the system.
Your assuming that the aud= io device your implementing the
volume/muting has volume cont= rol and muting functionality within
itself, then.
Not really. We are assuming that audio hardware has a bette= r understanding
of how that system implements volume than som= e generic EFI Code that is by
definition platform agnostic.
This may n= ot be the case, and so we'd need to
effectively simulate it w= ithin the driver, which isn't too hard to do.
As an example, = the VirtIO driver does not have a request type for
muting or = for volume control (this would, most likely, be within the
VI= RTIO_SND_R_PCM_SET_PARAMS request, see sec. 5.14.6.4.3). Therefore,
either the driver would have to simulate the request or return
EFI_UNSUPPORTED this instance.

So this is an example of above since the audio hardware knows it is = routing
the sound output into another subsystem, and that sub= system controls the
volume. So the VirtIo Sound Driver know b= est how to bstract volume/mute for
this platform.

Consider also the point of view of the developer implement= ing a driver
for some other piece of audio hardware that happ= ens not to support
precisely the same sample rates etc as Vir= tIO.  It would be extremely
ugly to force all future har= dware to pretend to have the same
capabilities as VirtIO just= because the API was initially designed with
VirtIO in mind.<= br class=3D"">
Precisely, but the brilliance of VirtIO
The brilliance of VirtIO is that it just needs to imple= ment a generic device
driver for a given operating system. In= most cases these operating systems
have sounds subsystems th= at manage sound and want fine granularity of
control on what = is going on. So the drivers are implemented to maximizes
flex= ibility since the OS has lots of generic code that deals with sound, andeven user configurable knobs to control audio. In our case that= extra layer
does not exist in EFI and the end user code just= want to tell the driver do
some simple things.

Maybe it is easier to think about with an example. Lets say= I want to play a
cool sound on every boot. What would be the= workflow to make the happen.
1) Encode the sound using a sta= ndard tool into a Wave PCM 16.
2) Place the Wave file in the = Firmware Volume using a given UUID as the
name. As simple as = editing the platform FDF file.
3) Write some BDS code
  a) Lookup Wave file by UUID and read it into memory.
  b) Point the EFI Sound Protocol at the buffer with t= he Wave file
  c) Tell the EFI Sound Protocol to pl= ay the sound.

If you start adding in a lot of = perimeters that work flow starts getting
really complicated r= eally quickly.

is that the sample rate,
sample format, &c., do no= t have to all be supported by a VirtIO
device. Notice, also, = how in my protocol proposal I noted that the
sample rates, at= least, were "recommended," not "required." Should a
device n= ot happen to support a sample rate or sample format, all it
n= eeds to do is return EFI_INVALID_PARAMETER. Section 5.14.6.2.1
(VIRTIO_SND_R_JACK_GET_CONFIG) describes how a jack tells you what
sample rates it supports, channel mappings, &c.
<= br class=3D"">I do understand how just using a predefined sample rate and s= ample
format might be a good idea, and if that's the best way= , then that's
what we'll do. The protocol can always be revis= ed at a later time if
necessary. I apologize if my stance see= ms obstinate.

I think if we add t= he version into the protocol and make sure we have a
separate= load and play operation we could add a member to set the extra
perimeters if needed. There might also be some platform specific generic=
tunables that might be interesting for a future member funct= ion.

Thanks,

Andr= ew Fish

A= lso, thank you, Laszlo, for your advice -- I hadn't considered that a
network driver would be another good way of figuring out how async=
works in UEFI.

On 4/15/21, Andr= ew Fish <afish@apple.com> wrote:

On Apr 15, 2021, at 5:07 AM, Mi= chael Brown <mcb30@ipxe.org= > wrote:

On 15/04/2021 06:28, Ethin Pro= bst wrote:
- I hoped to = add recording in case we in future want to add
accessibility = aids like speech recognition (that was one of the todo
tasks = on the EDK2 tasks list)
Is there any necessity f= or audio input and output to be implemented
within
the same protocol?  Unlike a network device (which is intrinsic= ally
bidirectional), it seems natural to conceptually separat= e audio input
from
audio output.
=
- Muting and volume con= trol could easily be added by just replacing
the sample buffe= r with silence and by multiplying all the samples by a
percen= tage.
The code controlling volume/mute may not h= ave any access to the sample
buffer.  The most natural i= mplementation would seem to allow for a
platform to notice vo= lume up/down keypresses and use those to control
the
overall system volume, without any knowledge of which samples (if an= y)
are
currently being played by other code in = the system.

I=E2=80=99ve also tho= ught of adding NVRAM variable that would let setup, UEFI
Shel= l,
or even the OS set the current volume, and Mute. This how = it would be
consumed concept is why I proposed mute and volum= e being separate APIs.
The
volume up/down API i= n addition to fixed percentage might be overkill, but
it
does allow a non liner mapping to the volume up/down keys. You wo= uld be
surprised how picky audiophiles can be and it seems th= ey like to file
Bugzillas.

- Fina= lly, the reason I used enumerations for specifying parameters
like sample rate and stuff was that I was looking at this protocol
from a general UEFI applications point of view. VirtIO supports all = of
the sample configurations listed in my gist, and it seems = reasonable
to allow the application to control those paramete= rs instead of
forcing a particular parameter configuration on= to the developer.
Consider also the point of vie= w of the developer implementing a driver
for
so= me other piece of audio hardware that happens not to support
= precisely
the same sample rates etc as VirtIO.  It would= be extremely ugly to
force
all future hardware= to pretend to have the same capabilities as VirtIO
just beca= use the API was initially designed with VirtIO in mind.

As a developer on the other side of the API, writing code to play= sound
files on an arbitrary unknown platform, I would prefer= to simply
consume
as simple as possible an abs= traction of an audio output protocol and
not
ha= ve to care about what hardware is actually implementing it.
<= br class=3D"">
It may make sense to have an API to load the buf= fer/stream and other APIs
to
play or pause. Thi= s could allow an optional API to configure how the
stream
is played back. If we add a version to the Protocol that would a= t least
future proof us.

We did = get feedback that it is very common to speed up the auto playback
rates for accessibility. I=E2=80=99m not sure if that is practical w= ith a simple
PCM
16 wave file with the firmware= audio implementation. I guess that is
something we could inv= estigate.

In terms of maybe adding text to spe= ech there is an open source project
that
concep= tually we could port to EFI. It would likely be a binary that
would
have to live on the EFI System Partition due to size. = I was thinking
that
words per minute could be p= art of that API and it would produce a PCM 16
wave file that = the audio protocol we are discussing could play.

Both of these argue in favour of de= fining a very simple API that
expresses
only a = common baseline capability that is plausibly implementable for
every piece of audio hardware ever made.

Cou= pled with the relatively minimalistic requirements for boot-time
audio,
I'd probably suggest supporting only a single f= ormat for audio data,
with
a fixed sample rate = (and possibly only mono output).

= In my world the folks that work for Jony asked for a stereo boot bong totransition from left to right :). This is not the CODEC you are= looking
for
was our response=E2=80=A6. I also = did not mention that some languages are right
to
left, as the only thing worse than one complex thing is two complex
things
to implement.

As always: perfection is achieved, not w= hen there is nothing more to
add,
but when ther= e is nothing left to take away.  :)

"Simplicity is the ultimate sophistication=E2=80=9D
<= br class=3D"">Thanks,

Andrew Fish

Thanks,

Michael







--
Signed,
Ethin D. Probst






--Apple-Mail=_C411943F-CDA1-45E8-BB2A-B3625A4AA9FB--