From: "Ethin Probst" <harlydavidsen@gmail.com>
Date: Thu, 15 Apr 2021 19:59:21 -0500
Subject: Re: [edk2-devel] VirtIO Sound Driver (GSoC 2021)
To: devel@edk2.groups.io, harlydavidsen@gmail.com
Cc: Andrew Fish, Michael Brown, Mike Kinney, Leif Lindholm, Laszlo Ersek,
 "Desimone, Nathaniel L", Rafael Rodrigues Machado, Gerd Hoffmann
In-Reply-To: <16762C957671127A.12361@groups.io>
References: <66e073bb-366b-0559-4a78-fc5e8215aca1@redhat.com>
 <16758FB6436B1195.32393@groups.io>
 <4AEC1784-99AF-47EF-B7DD-77F91EA3D7E9@apple.com>
 <309cc5ca-2ecd-79dd-b183-eec0572ea982@ipxe.org>
 <16762C957671127A.12361@groups.io>

Also, I'm a bit confused. I've looked at several VirtIO devices now and have
seen things like this:

#define VIRTIO_PCI_DEVICE_SIGNATURE SIGNATURE_32 ('V', 'P', 'C', 'I')
// ...
UINT32 Signature;

I can't seem to find this anywhere in the VirtIO specification. The spec says
nothing about signature values anywhere, just the magic value of 0x74726976.
So where does this come from?
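The pattern I keep running into looks roughly like this (sketched from
memory, so the struct layout and macro name below are approximate rather than
quoted from any particular driver):

#define VIRTIO_PCI_DEVICE_SIGNATURE  SIGNATURE_32 ('V', 'P', 'C', 'I')

typedef struct {
  UINT32                  Signature;     // always the value above
  VIRTIO_DEVICE_PROTOCOL  VirtioDevice;  // protocol produced for consumers
  EFI_PCI_IO_PROTOCOL     *PciIo;        // underlying transport
  // ... other per-device bookkeeping ...
} VIRTIO_PCI_DEVICE;

//
// Recover the driver's private context from the protocol pointer a consumer
// hands back, sanity-checking the signature along the way.
//
#define VIRTIO_PCI_DEVICE_FROM_VIRTIO_DEVICE(Device) \
  CR (Device, VIRTIO_PCI_DEVICE, VirtioDevice, VIRTIO_PCI_DEVICE_SIGNATURE)

Is this just driver-internal bookkeeping for CR(), rather than anything the
device itself ever sees?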
On 4/15/21, Ethin Probst via groups.io wrote:
> Hi Andrew,
>
> What would that protocol interface look like if we utilized your idea?
> With mine (though I need to add channel mapping as well), your workflow for
> playing a stereo sound from left to right would probably be something like
> this:
> 1) Encode the sound using a standard tool into a Wave PCM 16.
> 2) Place the Wave file in the Firmware Volume using a given UUID as the
>    name. As simple as editing the platform FDF file.
> 3) Write some BDS code:
>    a) Look up the Wave file by UUID and read it into memory.
>    b) Decode the audio file (audio devices will not do this decoding for
>       you; you have to do that yourself).
>    c) Call EFI_AUDIO_PROTOCOL.LoadBuffer(), passing in the sample rate of
>       your audio, EFI_AUDIO_PROTOCOL_SAMPLE_FORMAT_S16 for signed 16-bit
>       PCM audio, the channel mapping, the number of samples, and the
>       samples themselves.
>    d) Call EFI_BOOT_SERVICES.CreateEvent()/EFI_BOOT_SERVICES.CreateEventEx()
>       to create a playback event to signal.
>    e) Call EFI_AUDIO_PROTOCOL.StartPlayback(), passing in the event you
>       just created.
> The reason that LoadBuffer() takes so many parameters is that the device
> does not know anything about the audio you're passing in. Given only an
> array of 16-bit audio samples, it's impossible to infer the parameters
> (sample rate, sample format, channel mapping, etc.) from that alone.
>
> Using your idea, though, my protocol could be greatly simplified. On the
> other hand, forcing a particular channel mapping, sample rate and sample
> format on everyone would complicate application code. From an application
> point of view, one would, with that type of protocol, need to do the
> following:
> 1) Load an audio file in any audio file format from any storage mechanism.
> 2) Decode the audio file format to extract the samples and audio metadata.
> 3) Resample the (now decoded) audio samples and convert (quantize) them
>    into signed 16-bit PCM audio.
> 4) Forward the samples to the EFI audio protocol.
>
> There is another option. (I'm happy we're discussing this now -- we can
> hammer out all the details up front, which will make a lot of things
> easier.) Since I'll most likely end up splitting the device-specific
> interfaces into separate audio protocols, we could make a simple audio
> protocol that makes various assumptions about the audio samples you're
> giving it (e.g. sample rate and format). This would just allow audio output
> and input in signed 16-bit PCM audio, as you've suggested, and would be a
> simple and easy-to-use interface. Something like:
>
> typedef struct EFI_SIMPLE_AUDIO_PROTOCOL {
>   EFI_SIMPLE_AUDIO_PROTOCOL_RESET Reset;
>   EFI_SIMPLE_AUDIO_PROTOCOL_START Start;
>   EFI_SIMPLE_AUDIO_PROTOCOL_STOP  Stop;
> } EFI_SIMPLE_AUDIO_PROTOCOL;
>
> This way, users and driver developers have a simple audio protocol they can
> implement if they like. It would assume signed 16-bit PCM audio and a mono
> channel mapping at 44100 Hz. Then we can have an advanced protocol for each
> device type (HDA, USB, VirtIO, ...) that exposes all the knobs for sample
> formats, sample rates, and that kind of thing. Obviously, like the majority
> (if not all) of UEFI protocols, these advanced protocols would be optional.
> I think, however, that the simple audio protocol should be a required
> protocol in all UEFI implementations, though that might not be possible. So
> would this simpler interface work as a starting point?
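> To make that concrete, here is roughly what I'm imagining for those three
> members (a sketch only; the parameter lists, and the fixed format they
> assume, are entirely open to change):
>
> typedef
> EFI_STATUS
> (EFIAPI *EFI_SIMPLE_AUDIO_PROTOCOL_RESET)(
>   IN EFI_SIMPLE_AUDIO_PROTOCOL  *This
>   );
>
> typedef
> EFI_STATUS
> (EFIAPI *EFI_SIMPLE_AUDIO_PROTOCOL_START)(
>   IN EFI_SIMPLE_AUDIO_PROTOCOL  *This,
>   IN CONST INT16                *Samples,     // signed 16-bit PCM, mono, 44100 Hz
>   IN UINTN                      SampleCount,
>   IN EFI_EVENT                  PlaybackDone  OPTIONAL  // signaled when playback finishes
>   );
>
> typedef
> EFI_STATUS
> (EFIAPI *EFI_SIMPLE_AUDIO_PROTOCOL_STOP)(
>   IN EFI_SIMPLE_AUDIO_PROTOCOL  *This
>   );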
>
> On 4/15/21, Andrew Fish wrote:
>>
>>> On Apr 15, 2021, at 1:11 PM, Ethin Probst wrote:
>>>
>>>> Is there any necessity for audio input and output to be implemented
>>>> within the same protocol? Unlike a network device (which is
>>>> intrinsically bidirectional), it seems natural to conceptually separate
>>>> audio input from audio output.
>>>
>>> Nope, there isn't a necessity to make them one; they can be separated
>>> into two.
>>>
>>>> The code controlling volume/mute may not have any access to the sample
>>>> buffer. The most natural implementation would seem to allow for a
>>>> platform to notice volume up/down keypresses and use those to control
>>>> the overall system volume, without any knowledge of which samples (if
>>>> any) are currently being played by other code in the system.
>>>
>>> You're assuming that the audio device you're implementing the
>>> volume/muting for has volume control and muting functionality within
>>> itself, then.
>>
>> Not really. We are assuming that audio hardware has a better understanding
>> of how that system implements volume than some generic EFI code that is by
>> definition platform agnostic.
>>
>>> This may not be the case, and so we'd need to effectively simulate it
>>> within the driver, which isn't too hard to do. As an example, the VirtIO
>>> driver does not have a request type for muting or for volume control
>>> (this would, most likely, be within the VIRTIO_SND_R_PCM_SET_PARAMS
>>> request, see sec. 5.14.6.4.3). Therefore, the driver would either have to
>>> simulate the request or return EFI_UNSUPPORTED in this instance.
>>
>> So this is an example of the above, since the audio hardware knows it is
>> routing the sound output into another subsystem, and that subsystem
>> controls the volume. So the VirtIo Sound Driver knows best how to abstract
>> volume/mute for this platform.
>>
>>>> Consider also the point of view of the developer implementing a driver
>>>> for some other piece of audio hardware that happens not to support
>>>> precisely the same sample rates etc. as VirtIO. It would be extremely
>>>> ugly to force all future hardware to pretend to have the same
>>>> capabilities as VirtIO just because the API was initially designed with
>>>> VirtIO in mind.
>>>
>>> Precisely, but the brilliance of VirtIO
>>
>> The brilliance of VirtIO is that it just needs to implement a generic
>> device driver for a given operating system. In most cases these operating
>> systems have sound subsystems that manage sound and want fine granularity
>> of control over what is going on. So the drivers are implemented to
>> maximize flexibility, since the OS has lots of generic code that deals
>> with sound, and even user-configurable knobs to control audio. In our case
>> that extra layer does not exist in EFI, and the end-user code just wants
>> to tell the driver to do some simple things.
>>
>> Maybe it is easier to think about with an example. Let's say I want to
>> play a cool sound on every boot. What would be the workflow to make that
>> happen?
>> 1) Encode the sound using a standard tool into a Wave PCM 16.
>> 2) Place the Wave file in the Firmware Volume using a given UUID as the
>>    name. As simple as editing the platform FDF file.
>> 3) Write some BDS code:
>>    a) Look up the Wave file by UUID and read it into memory.
>>    b) Point the EFI Sound Protocol at the buffer with the Wave file.
>>    c) Tell the EFI Sound Protocol to play the sound.
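>>
>> The BDS side might look roughly like this (a sketch only --
>> EFI_SOUND_PROTOCOL, its Load()/Play() members, gEfiSoundProtocolGuid and
>> gBootChimeFileGuid are hypothetical names used for illustration, not an
>> existing API):
>>
>> #include <Library/DxeServicesLib.h>
>> #include <Library/MemoryAllocationLib.h>
>> #include <Library/UefiBootServicesTableLib.h>
>>
>> EFI_STATUS
>> PlayBootChime (
>>   VOID
>>   )
>> {
>>   EFI_SOUND_PROTOCOL  *Sound;
>>   VOID                *WaveFile;
>>   UINTN               WaveFileSize;
>>   EFI_STATUS          Status;
>>
>>   //
>>   // a) Look up the Wave file by its GUID name in any firmware volume.
>>   //
>>   Status = GetSectionFromAnyFv (
>>              &gBootChimeFileGuid,
>>              EFI_SECTION_RAW,
>>              0,
>>              &WaveFile,
>>              &WaveFileSize
>>              );
>>   if (EFI_ERROR (Status)) {
>>     return Status;
>>   }
>>
>>   //
>>   // b) Point the sound protocol at the Wave buffer, and c) play it.
>>   //
>>   Status = gBS->LocateProtocol (&gEfiSoundProtocolGuid, NULL, (VOID **)&Sound);
>>   if (!EFI_ERROR (Status)) {
>>     Status = Sound->Load (Sound, WaveFile, WaveFileSize);
>>     if (!EFI_ERROR (Status)) {
>>       Status = Sound->Play (Sound);
>>     }
>>   }
>>
>>   FreePool (WaveFile);
>>   return Status;
>> }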
>>
>> If you start adding in a lot of parameters, that workflow starts getting
>> really complicated really quickly.
>>
>>> is that the sample rate, sample format, &c., do not all have to be
>>> supported by a VirtIO device. Notice, also, how in my protocol proposal I
>>> noted that the sample rates, at least, were "recommended," not
>>> "required." Should a device not happen to support a sample rate or sample
>>> format, all it needs to do is return EFI_INVALID_PARAMETER. Section
>>> 5.14.6.2.1 (VIRTIO_SND_R_JACK_GET_CONFIG) describes how a jack tells you
>>> what sample rates it supports, channel mappings, &c.
>>>
>>> I do understand how just using a predefined sample rate and sample format
>>> might be a good idea, and if that's the best way, then that's what we'll
>>> do. The protocol can always be revised at a later time if necessary. I
>>> apologize if my stance seems obstinate.
>>
>> I think if we add the version into the protocol and make sure we have
>> separate load and play operations, we could add a member to set the extra
>> parameters if needed. There might also be some platform-specific generic
>> tunables that might be interesting for a future member function.
>>
>> Thanks,
>>
>> Andrew Fish
>>
>>> Also, thank you, Laszlo, for your advice -- I hadn't considered that a
>>> network driver would be another good way of figuring out how async works
>>> in UEFI.
>>>
>>> On 4/15/21, Andrew Fish wrote:
>>>>
>>>>> On Apr 15, 2021, at 5:07 AM, Michael Brown wrote:
>>>>>
>>>>> On 15/04/2021 06:28, Ethin Probst wrote:
>>>>>> - I hoped to add recording in case we in the future want to add
>>>>>> accessibility aids like speech recognition (that was one of the todo
>>>>>> tasks on the EDK2 tasks list)
>>>>>
>>>>> Is there any necessity for audio input and output to be implemented
>>>>> within the same protocol? Unlike a network device (which is
>>>>> intrinsically bidirectional), it seems natural to conceptually separate
>>>>> audio input from audio output.
>>>>>
>>>>>> - Muting and volume control could easily be added by just replacing
>>>>>> the sample buffer with silence and by multiplying all the samples by a
>>>>>> percentage.
>>>>>
>>>>> The code controlling volume/mute may not have any access to the sample
>>>>> buffer. The most natural implementation would seem to allow for a
>>>>> platform to notice volume up/down keypresses and use those to control
>>>>> the overall system volume, without any knowledge of which samples (if
>>>>> any) are currently being played by other code in the system.
>>>>>
>>>>
>>>> I've also thought of adding an NVRAM variable that would let setup, the
>>>> UEFI Shell, or even the OS set the current volume and mute state. This
>>>> "how it would be consumed" concept is why I proposed mute and volume
>>>> being separate APIs. The volume up/down API, in addition to a fixed
>>>> percentage, might be overkill, but it does allow a non-linear mapping to
>>>> the volume up/down keys. You would be surprised how picky audiophiles
>>>> can be, and it seems they like to file Bugzillas.
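>>>>
>>>> Roughly what I have in mind for the variable (a sketch; the variable
>>>> name, GUID, and layout here are made up for illustration and not defined
>>>> anywhere yet):
>>>>
>>>> typedef struct {
>>>>   UINT8    Volume;   // 0-100, percent of full scale
>>>>   BOOLEAN  Mute;
>>>> } SYSTEM_AUDIO_CONFIG;
>>>>
>>>> // Hypothetical vendor GUID that would own the variable.
>>>> EFI_GUID gSystemAudioConfigGuid = {
>>>>   0x12345678, 0x1234, 0x1234,
>>>>   { 0x12, 0x34, 0x12, 0x34, 0x12, 0x34, 0x12, 0x34 }
>>>> };
>>>>
>>>> SYSTEM_AUDIO_CONFIG Config = { 80, FALSE };
>>>>
>>>> gRT->SetVariable (
>>>>        L"SystemAudioConfig",
>>>>        &gSystemAudioConfigGuid,
>>>>        EFI_VARIABLE_NON_VOLATILE | EFI_VARIABLE_BOOTSERVICE_ACCESS |
>>>>          EFI_VARIABLE_RUNTIME_ACCESS,
>>>>        sizeof (Config),
>>>>        &Config
>>>>        );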
>>>>
>>>>>> - Finally, the reason I used enumerations for specifying parameters
>>>>>> like sample rate and such was that I was looking at this protocol from
>>>>>> a general UEFI application's point of view. VirtIO supports all of the
>>>>>> sample configurations listed in my gist, and it seems reasonable to
>>>>>> allow the application to control those parameters instead of forcing a
>>>>>> particular parameter configuration onto the developer.
>>>>>
>>>>> Consider also the point of view of the developer implementing a driver
>>>>> for some other piece of audio hardware that happens not to support
>>>>> precisely the same sample rates etc. as VirtIO. It would be extremely
>>>>> ugly to force all future hardware to pretend to have the same
>>>>> capabilities as VirtIO just because the API was initially designed with
>>>>> VirtIO in mind.
>>>>>
>>>>> As a developer on the other side of the API, writing code to play sound
>>>>> files on an arbitrary unknown platform, I would prefer to simply consume
>>>>> as simple as possible an abstraction of an audio output protocol and not
>>>>> have to care about what hardware is actually implementing it.
>>>>>
>>>>
>>>> It may make sense to have an API to load the buffer/stream and other
>>>> APIs to play or pause. This could allow an optional API to configure how
>>>> the stream is played back. If we add a version to the protocol, that
>>>> would at least future-proof us.
>>>>
>>>> We did get feedback that it is very common to speed up the audio
>>>> playback rate for accessibility. I'm not sure whether that is practical
>>>> with a simple PCM 16 wave file and a firmware audio implementation. I
>>>> guess that is something we could investigate.
>>>>
>>>> In terms of maybe adding text-to-speech, there is an open source project
>>>> that conceptually we could port to EFI. It would likely be a binary that
>>>> would have to live on the EFI System Partition due to its size. I was
>>>> thinking that words per minute could be part of that API, and it would
>>>> produce a PCM 16 wave file that the audio protocol we are discussing
>>>> could play.
>>>>
>>>>> Both of these argue in favour of defining a very simple API that
>>>>> expresses only a common baseline capability that is plausibly
>>>>> implementable for every piece of audio hardware ever made.
>>>>>
>>>>> Coupled with the relatively minimalistic requirements for boot-time
>>>>> audio, I'd probably suggest supporting only a single format for audio
>>>>> data, with a fixed sample rate (and possibly only mono output).
>>>>>
>>>>
>>>> In my world the folks that work for Jony asked for a stereo boot bong to
>>>> transition from left to right :). "This is not the CODEC you are looking
>>>> for" was our response... I also did not mention that some languages are
>>>> right to left, as the only thing worse than one complex thing is two
>>>> complex things to implement.
>>>>
>>>>> As always: perfection is achieved, not when there is nothing more to
>>>>> add, but when there is nothing left to take away. :)
>>>>>
>>>>
>>>> "Simplicity is the ultimate sophistication."
>>>>
>>>> Thanks,
>>>>
>>>> Andrew Fish
>>>>
>>>>> Thanks,
>>>>>
>>>>> Michael
>>>>
>>>
>>> --
>>> Signed,
>>> Ethin D. Probst
>>
>
> --
> Signed,
> Ethin D. Probst
>

--
Signed,
Ethin D. Probst