Subject: Re: [edk2-devel] VirtIO Sound Driver (GSoC 2021)
From: Marvin Häuser <mhaeuser@posteo.de>
To: devel@edk2.groups.io, harlydavidsen@gmail.com, Andrew Fish
Cc: Michael Brown, Mike Kinney, Leif Lindholm, Laszlo Ersek, "Desimone, Nathaniel L", Rafael Rodrigues Machado, Gerd Hoffmann
Date: Fri, 16 Apr 2021 13:22:14 +0000
Message-ID: <406e5bdb-f6aa-21ce-c96a-b16fb07c181d@posteo.de>

Good day,

Sorry for the nitpicking.

- Protocols always need a "Revision" field as the first member. It is used so that their capabilities can be expanded in later revisions without introducing a new, distinct protocol.

- Consider the name EFI_SIMPLE_AUDIO_OUTPUT(!)_PROTOCOL, to not cause confusion if input is ever added. Input in my opinion should be a separate protocol, as there is no reason why they would necessarily be coupled topology-wise (think of a USB microphone, it will never have any sort of output).

- To make code safety a bit easier, try to use "CONST" for "IN" (non-OUT) pointers, so that CONST can be propagated where possible.

- Please do *not* make the events caller-owned. We have had it multiple times already on production firmware that events are left dangling and may be polled/signaled after ExitBS(). The caller should maybe be able to decide on some policy (i.e. abort, or block on ExitBS() until the playback has finished), as cut-off audio may be awkward; but the callee definitely should implement "event safety" itself. Maybe avoid exposing events directly at all and provide nice abstractions the caller cannot misuse.

- I don't think audio should be required at all; the required subset should firstly consider minimalism and security. Accessibility will not be of concern for some IoT device, where the audio code would simply eat space and introduce a larger surface for bugs.

Best regards,
Marvin

On 16.04.21 01:42, Ethin Probst wrote:
> Hi Andrew,
>
> What would that protocol interface look like if we utilized your idea?
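
Purely for illustration, the review points above might combine into a header sketch along the following lines. Every name in it (the protocol, the Start member, the revision macro) is hypothetical and not part of any proposal in this thread:

  #define EFI_SIMPLE_AUDIO_OUTPUT_PROTOCOL_REVISION  0x00010000

  typedef struct _EFI_SIMPLE_AUDIO_OUTPUT_PROTOCOL EFI_SIMPLE_AUDIO_OUTPUT_PROTOCOL;

  typedef
  EFI_STATUS
  (EFIAPI *EFI_SIMPLE_AUDIO_OUTPUT_START)(
    IN EFI_SIMPLE_AUDIO_OUTPUT_PROTOCOL  *This,
    IN CONST INT16                       *Samples,     // CONST on IN (non-OUT) pointers
    IN UINTN                             SampleCount
    );

  struct _EFI_SIMPLE_AUDIO_OUTPUT_PROTOCOL {
    UINT64                         Revision;   // first member, so later revisions can grow the struct
    EFI_SIMPLE_AUDIO_OUTPUT_START  Start;
  };

Keeping Revision as the first member lets a caller check it before relying on members that only exist in later revisions.
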
> With mine (though I need to add channel mapping as well), your
> workflow for playing a stereo sound from left to right would probably
> be something like this:
> 1) Encode the sound using a standard tool into a Wave PCM 16.
> 2) Place the Wave file in the Firmware Volume using a given UUID as
> the name. As simple as editing the platform FDF file.
> 3) Write some BDS code
>    a) Look up the Wave file by UUID and read it into memory.
>    b) Decode the audio file (audio devices will not do this decoding
> for you, you have to do that yourself).
>    c) Call EFI_AUDIO_PROTOCOL.LoadBuffer(), passing in the sample rate
> of your audio, EFI_AUDIO_PROTOCOL_SAMPLE_FORMAT_S16 for signed 16-bit
> PCM audio, the channel mapping, the number of samples, and the samples
> themselves.
>    d) Call EFI_BOOT_SERVICES.CreateEvent()/EFI_BOOT_SERVICES.CreateEventEx()
> for a playback event to signal.
>    e) Call EFI_AUDIO_PROTOCOL.StartPlayback(), passing in the event you
> just created.
> The reason that LoadBuffer() takes so many parameters is that the
> device does not know anything about the audio you're passing in. If
> I'm given an array of 16-bit audio samples, it's impossible to know
> the parameters (sample rate, sample format, channel mapping, etc.)
> from that alone.
> Using your idea, though, my protocol could be greatly simplified.
> Forcing a particular channel mapping, sample rate and sample format on
> everyone would complicate application code. From an application point
> of view, one would, with that type of protocol, need to do the
> following:
> 1) Load an audio file in any audio file format from any storage mechanism.
> 2) Decode the audio file format to extract the samples and audio metadata.
> 3) Resample the (now decoded) audio samples and convert (quantize) the
> audio samples into signed 16-bit PCM audio.
> 4) Forward the samples on to the EFI audio protocol.
> There is another option. (I'm happy we're discussing this now -- we
> can hammer out all the details now, which will make a lot of things
> easier.) Since I'll most likely end up splitting the device-specific
> interfaces into different audio protocols, we could make a simple
> audio protocol that makes various assumptions about the audio samples
> you're giving it (e.g. sample rate, format, ...). This would just
> allow audio output and input in signed 16-bit PCM audio, as you've
> suggested, and would be a simple and easy-to-use interface. Something
> like:
> typedef struct EFI_SIMPLE_AUDIO_PROTOCOL {
>   EFI_SIMPLE_AUDIO_PROTOCOL_RESET Reset;
>   EFI_SIMPLE_AUDIO_PROTOCOL_START Start;
>   EFI_SIMPLE_AUDIO_PROTOCOL_STOP Stop;
> } EFI_SIMPLE_AUDIO_PROTOCOL;
> This way, users and driver developers have a simple audio protocol
> they can implement if they like. It would assume signed 16-bit PCM
> audio and mono channel mappings at 44100 Hz. Then, we can have an
> advanced protocol for each device type (HDA, USB, VirtIO, ...) that
> exposes all the knobs for sample formats, sample rates, that kind of
> thing. Obviously, like the majority of (if not all) UEFI protocols,
> these advanced protocols would be optional. I think, however, that the
> simple audio protocol should be a required protocol in all UEFI
> implementations. But that might not be possible. So would this simpler
> interface work as a starting point?
>
> On 4/15/21, Andrew Fish wrote:
>>
>>> On Apr 15, 2021, at 1:11 PM, Ethin Probst wrote:
>>>
>>>> Is there any necessity for audio input and output to be implemented
>>>> within the same protocol?
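
For illustration only, steps a) through e) above might translate into caller code roughly as follows. The EFI_AUDIO_PROTOCOL member signatures and GUID are assumptions taken from the description above, error handling is mostly omitted, and only gBS->LocateProtocol(), gBS->CreateEvent() and ASSERT_EFI_ERROR() are existing EDK2 interfaces:

  EFI_STATUS          Status;
  EFI_AUDIO_PROTOCOL  *Audio;
  EFI_EVENT           PlaybackDone;

  Status = gBS->LocateProtocol (&gEfiAudioProtocolGuid, NULL, (VOID **)&Audio);
  ASSERT_EFI_ERROR (Status);

  // c) Describe the decoded samples to the driver. Samples, SampleCount and
  //    ChannelMap are assumed to come from the decode step b) above.
  Status = Audio->LoadBuffer (
                    Audio,
                    44100,                                 // sample rate
                    EFI_AUDIO_PROTOCOL_SAMPLE_FORMAT_S16,  // signed 16-bit PCM
                    ChannelMap,
                    SampleCount,
                    Samples
                    );

  // d) Create an event the driver signals when playback completes.
  Status = gBS->CreateEvent (0, TPL_CALLBACK, NULL, NULL, &PlaybackDone);

  // e) Start playback, passing in the event just created.
  Status = Audio->StartPlayback (Audio, PlaybackDone);
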
>>>> Unlike a network device (which is intrinsically bidirectional), it
>>>> seems natural to conceptually separate audio input from audio output.
>>> Nope, there isn't a necessity to combine them in one; they can be
>>> separated into two.
>>>
>>>> The code controlling volume/mute may not have any access to the sample
>>>> buffer. The most natural implementation would seem to allow for a
>>>> platform to notice volume up/down keypresses and use those to control
>>>> the overall system volume, without any knowledge of which samples (if
>>>> any) are currently being played by other code in the system.
>>> You're assuming that the audio device you're implementing the
>>> volume/muting for has volume control and muting functionality within
>>> itself, then.
>> Not really. We are assuming that audio hardware has a better understanding
>> of how that system implements volume than some generic EFI code that is by
>> definition platform agnostic.
>>
>>> This may not be the case, and so we'd need to
>>> effectively simulate it within the driver, which isn't too hard to do.
>>> As an example, the VirtIO driver does not have a request type for
>>> muting or for volume control (this would, most likely, be within the
>>> VIRTIO_SND_R_PCM_SET_PARAMS request, see sec. 5.14.6.4.3). Therefore,
>>> either the driver would have to simulate the request or return
>>> EFI_UNSUPPORTED in this instance.
>>>
>> So this is an example of the above, since the audio hardware knows it is
>> routing the sound output into another subsystem, and that subsystem
>> controls the volume. So the VirtIo Sound Driver knows best how to abstract
>> volume/mute for this platform.
>>
>>>> Consider also the point of view of the developer implementing a driver
>>>> for some other piece of audio hardware that happens not to support
>>>> precisely the same sample rates etc. as VirtIO. It would be extremely
>>>> ugly to force all future hardware to pretend to have the same
>>>> capabilities as VirtIO just because the API was initially designed with
>>>> VirtIO in mind.
>>> Precisely, but the brilliance of VirtIO
>> The brilliance of VirtIO is that it just needs to implement a generic
>> device driver for a given operating system. In most cases these operating
>> systems have sound subsystems that manage sound and want fine granularity
>> of control over what is going on. So the drivers are implemented to
>> maximize flexibility, since the OS has lots of generic code that deals
>> with sound, and even user-configurable knobs to control audio. In our case
>> that extra layer does not exist in EFI and the end-user code just wants to
>> tell the driver to do some simple things.
>>
>> Maybe it is easier to think about with an example. Let's say I want to
>> play a cool sound on every boot. What would be the workflow to make that
>> happen?
>> 1) Encode the sound using a standard tool into a Wave PCM 16.
>> 2) Place the Wave file in the Firmware Volume using a given UUID as the
>> name. As simple as editing the platform FDF file.
>> 3) Write some BDS code
>>    a) Look up the Wave file by UUID and read it into memory.
>>    b) Point the EFI Sound Protocol at the buffer with the Wave file.
>>    c) Tell the EFI Sound Protocol to play the sound.
>>
>> If you start adding in a lot of parameters, that workflow starts getting
>> really complicated really quickly.
>>
>>> is that the sample rate,
>>> sample format, &c., do not have to all be supported by a VirtIO
>>> device.
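
A rough sketch of the three BDS steps above, again only illustrative: GetSectionFromAnyFv() is the existing DxeServicesLib helper for step a), while gBootChimeWaveFileGuid and the Sound protocol members (SetBuffer, Play) are placeholders for whatever interface is eventually agreed on:

  EFI_STATUS  Status;
  VOID        *WaveFile;
  UINTN       WaveFileSize;

  // a) Look up the Wave file by its GUID name and read it into memory.
  Status = GetSectionFromAnyFv (
             &gBootChimeWaveFileGuid,
             EFI_SECTION_RAW,
             0,
             &WaveFile,
             &WaveFileSize
             );

  // b) Point the (hypothetical) EFI Sound Protocol at the buffer with the Wave file.
  Status = Sound->SetBuffer (Sound, WaveFile, WaveFileSize);

  // c) Tell the EFI Sound Protocol to play the sound.
  Status = Sound->Play (Sound);
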
>>> Notice, also, how in my protocol proposal I noted that the
>>> sample rates, at least, were "recommended," not "required." Should a
>>> device not happen to support a sample rate or sample format, all it
>>> needs to do is return EFI_INVALID_PARAMETER. Section 5.14.6.2.1
>>> (VIRTIO_SND_R_JACK_GET_CONFIG) describes how a jack tells you what
>>> sample rates it supports, channel mappings, &c.
>>>
>>> I do understand how just using a predefined sample rate and sample
>>> format might be a good idea, and if that's the best way, then that's
>>> what we'll do. The protocol can always be revised at a later time if
>>> necessary. I apologize if my stance seems obstinate.
>>>
>> I think if we add the version into the protocol and make sure we have a
>> separate load and play operation, we could add a member to set the extra
>> parameters if needed. There might also be some platform-specific generic
>> tunables that might be interesting for a future member function.
>>
>> Thanks,
>>
>> Andrew Fish
>>
>>> Also, thank you, Laszlo, for your advice -- I hadn't considered that a
>>> network driver would be another good way of figuring out how async
>>> works in UEFI.
>>>
>>> On 4/15/21, Andrew Fish wrote:
>>>>
>>>>> On Apr 15, 2021, at 5:07 AM, Michael Brown wrote:
>>>>>
>>>>> On 15/04/2021 06:28, Ethin Probst wrote:
>>>>>> - I hoped to add recording in case we in future want to add
>>>>>> accessibility aids like speech recognition (that was one of the todo
>>>>>> tasks on the EDK2 tasks list)
>>>>> Is there any necessity for audio input and output to be implemented
>>>>> within the same protocol? Unlike a network device (which is
>>>>> intrinsically bidirectional), it seems natural to conceptually
>>>>> separate audio input from audio output.
>>>>>
>>>>>> - Muting and volume control could easily be added by just replacing
>>>>>> the sample buffer with silence and by multiplying all the samples by
>>>>>> a percentage.
>>>>> The code controlling volume/mute may not have any access to the sample
>>>>> buffer. The most natural implementation would seem to allow for a
>>>>> platform to notice volume up/down keypresses and use those to control
>>>>> the overall system volume, without any knowledge of which samples (if
>>>>> any) are currently being played by other code in the system.
>>>>>
>>>> I've also thought of adding an NVRAM variable that would let setup, the
>>>> UEFI Shell, or even the OS set the current volume and mute. This "how it
>>>> would be consumed" concept is why I proposed mute and volume being
>>>> separate APIs. The volume up/down API in addition to a fixed percentage
>>>> might be overkill, but it does allow a non-linear mapping to the volume
>>>> up/down keys. You would be surprised how picky audiophiles can be, and
>>>> it seems they like to file Bugzillas.
>>>>
>>>>>> - Finally, the reason I used enumerations for specifying parameters
>>>>>> like sample rate and stuff was that I was looking at this protocol
>>>>>> from a general UEFI application's point of view. VirtIO supports all
>>>>>> of the sample configurations listed in my gist, and it seems
>>>>>> reasonable to allow the application to control those parameters
>>>>>> instead of forcing a particular parameter configuration onto the
>>>>>> developer.
>>>>> Consider also the point of view of the developer implementing a driver
>>>>> for some other piece of audio hardware that happens not to support
>>>>> precisely the same sample rates etc. as VirtIO.
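
The NVRAM variable idea above could be sketched like this, with the variable name, GUID and layout invented for the example; gRT->SetVariable() and the attribute flags are the standard Runtime Services interface:

  typedef struct {
    UINT8    Volume;   // 0-100 percent
    BOOLEAN  Mute;
  } AUDIO_OUTPUT_SETTINGS;

  EFI_STATUS             Status;
  AUDIO_OUTPUT_SETTINGS  Settings = { 80, FALSE };

  // Persist the current volume/mute so setup, the UEFI Shell or the OS can
  // read and change it.
  Status = gRT->SetVariable (
                  L"AudioOutputSettings",
                  &gAudioOutputSettingsGuid,
                  EFI_VARIABLE_NON_VOLATILE |
                  EFI_VARIABLE_BOOTSERVICE_ACCESS |
                  EFI_VARIABLE_RUNTIME_ACCESS,
                  sizeof (Settings),
                  &Settings
                  );
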
>>>>> It would be extremely ugly to force all future hardware to pretend to
>>>>> have the same capabilities as VirtIO just because the API was
>>>>> initially designed with VirtIO in mind.
>>>>>
>>>>> As a developer on the other side of the API, writing code to play
>>>>> sound files on an arbitrary unknown platform, I would prefer to simply
>>>>> consume as simple as possible an abstraction of an audio output
>>>>> protocol and not have to care about what hardware is actually
>>>>> implementing it.
>>>>>
>>>> It may make sense to have an API to load the buffer/stream and other
>>>> APIs to play or pause. This could allow an optional API to configure
>>>> how the stream is played back. If we add a version to the Protocol,
>>>> that would at least future-proof us.
>>>>
>>>> We did get feedback that it is very common to speed up the audio
>>>> playback rate for accessibility. I'm not sure if that is practical with
>>>> a simple PCM 16 wave file with the firmware audio implementation. I
>>>> guess that is something we could investigate.
>>>>
>>>> In terms of maybe adding text-to-speech, there is an open source
>>>> project that conceptually we could port to EFI. It would likely be a
>>>> binary that would have to live on the EFI System Partition due to size.
>>>> I was thinking that words per minute could be part of that API and it
>>>> would produce a PCM 16 wave file that the audio protocol we are
>>>> discussing could play.
>>>>
>>>>> Both of these argue in favour of defining a very simple API that
>>>>> expresses only a common baseline capability that is plausibly
>>>>> implementable for every piece of audio hardware ever made.
>>>>>
>>>>> Coupled with the relatively minimalistic requirements for boot-time
>>>>> audio, I'd probably suggest supporting only a single format for audio
>>>>> data, with a fixed sample rate (and possibly only mono output).
>>>>>
>>>> In my world the folks that work for Jony asked for a stereo boot bong
>>>> to transition from left to right :). "This is not the CODEC you are
>>>> looking for" was our response… I also did not mention that some
>>>> languages are right to left, as the only thing worse than one complex
>>>> thing is two complex things to implement.
>>>>
>>>>> As always: perfection is achieved, not when there is nothing more to
>>>>> add, but when there is nothing left to take away. :)
>>>>>
>>>> "Simplicity is the ultimate sophistication"
>>>>
>>>> Thanks,
>>>>
>>>> Andrew Fish
>>>>
>>>>> Thanks,
>>>>>
>>>>> Michael
>>>>
>>>
>>> --
>>> Signed,
>>> Ethin D. Probst