From: "Andrew Fish" <afish@apple.com>
To: edk2-devel-groups-io <devel@edk2.groups.io>,
"Marvin Häuser" <mhaeuser@posteo.de>
Cc: Ethin Probst <harlydavidsen@gmail.com>,
Michael Brown <mcb30@ipxe.org>,
Mike Kinney <michael.d.kinney@intel.com>,
Leif Lindholm <leif@nuviainc.com>,
Laszlo Ersek <lersek@redhat.com>,
"Desimone, Nathaniel L" <nathaniel.l.desimone@intel.com>,
Rafael Rodrigues Machado <rafaelrodrigues.machado@gmail.com>,
Gerd Hoffmann <kraxel@redhat.com>
Subject: Re: [edk2-devel] VirtIO Sound Driver (GSoC 2021)
Date: Fri, 16 Apr 2021 07:34:27 -0700
Message-ID: <2296BE7E-ACEE-4286-9A5C-408B2D1ADC2E@apple.com>
In-Reply-To: <406e5bdb-f6aa-21ce-c96a-b16fb07c181d@posteo.de>
> On Apr 16, 2021, at 6:22 AM, Marvin Häuser <mhaeuser@posteo.de> wrote:
>
> Good day,
>
> Sorry for the nitpicking.
>
> - Protocols always need a "Revision" field as first member. This is used to be able to expand its capabilities in later revisions without introducing a new, distinct protocol.
> - Consider the name EFI_SIMPLE_AUDIO_OUTPUT(!)_PROTOCOL, to not cause confusion if input is ever added. Input, in my opinion, should be a separate protocol, as there is no reason why they would necessarily be coupled topology-wise (think of a USB microphone; it will never have any sort of output).
> - To make code safety a bit easier, try to use "CONST" for "IN" (non-OUT) pointers, so that CONST can be propagated where possible.
> - Please do *not* make the events caller-owned. We have seen it multiple times already on production firmware that events are left dangling and may be polled/signaled after ExitBS(). The caller should be able to decide on some policy maybe (i.e. abort or block on ExitBS() until the playback has finished), as cut-off audio may be awkward; but the callee definitely should implement "event safety" itself. Maybe avoid exposing events directly at all and provide nice abstractions the caller cannot misuse.
> - I don't think audio should be required at all; the required subset should firstly consider minimalism and security. Accessibility will not be of concern for some IoT device; the audio code would simply eat space and introduce a larger surface for bugs.
>
Marvin,
Generally how we work this in the UEFI Specification is we make it optional via wording like the following: “If a platform includes the ability to play audio in EFI, then the EFI_SIMPLE_AUDIO_OUTPUT_PROTOCOL must be implemented.”
Basically this requirement will get added to section 2.6.2, “Platform-Specific Elements”, of the UEFI Specification.
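
For what it's worth, with Marvin's Revision suggestion folded in, the top of such a protocol header could look something like this (a sketch only; the type name, member names, and revision value are placeholders, nothing here is agreed):

  #include <Uefi.h>

  #define EFI_SIMPLE_AUDIO_OUTPUT_PROTOCOL_REVISION  0x00010000

  typedef struct _EFI_SIMPLE_AUDIO_OUTPUT_PROTOCOL EFI_SIMPLE_AUDIO_OUTPUT_PROTOCOL;

  typedef
  EFI_STATUS
  (EFIAPI *EFI_SIMPLE_AUDIO_OUTPUT_STOP)(
    IN EFI_SIMPLE_AUDIO_OUTPUT_PROTOCOL  *This
    );

  struct _EFI_SIMPLE_AUDIO_OUTPUT_PROTOCOL {
    //
    // Revision stays the first member so a caller can check it before
    // touching anything added in a later revision of the protocol.
    //
    UINT64                        Revision;
    EFI_SIMPLE_AUDIO_OUTPUT_STOP  Stop;
    //
    // Start, volume/mute, etc. would follow; new members only ever get
    // appended, and the revision value gets bumped when they do.
    //
  };

A caller built against an older header then just compares Revision against the value it knows before using any newer member.
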
Thanks,
Andrew Fish
> Best regards,
> Marvin
>
> On 16.04.21 01:42, Ethin Probst wrote:
>> Hi Andrew,
>>
>> What would that protocol interface look like if we utilized your idea?
>> With mine (though I need to add channel mapping as well), your
>> workflow for playing a stereo sound from left to right would probably
>> be something like this (roughly sketched in code after the list):
>> 1) Encode the sound using a standard tool into a Wave PCM 16.
>> 2) Place the Wave file in the Firmware Volume using a given UUID as
>> the name. As simple as editing the platform FDF file.
>> 3) Write some BDS code
>> a) Look up the Wave file by UUID and read it into memory.
>> b) Decode the audio file (audio devices will not do this decoding
>> for you; you have to do that yourself).
>> c) Call EFI_AUDIO_PROTOCOL.LoadBuffer(), passing in the sample rate
>> of your audio, EFI_AUDIO_PROTOCOL_SAMPLE_FORMAT_S16 for signed 16-bit
>> PCM audio, the channel mapping, the number of samples, and the samples
>> themselves.
>> d) Call EFI_BOOT_SERVICES.CreateEvent()/EFI_BOOT_SERVICES.CreateEventEx()
>> for a playback event to signal.
>> e) Call EFI_AUDIO_PROTOCOL.StartPlayback(), passing in the event you
>> just created.
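
In rough code, the steps above might look something like this (a sketch only; the EFI_AUDIO_PROTOCOL calls are the proposed ones and appear only in comments, WaveFileGuid stands in for whatever FILE_GUID the platform FDF assigns, and error handling plus buffer cleanup are trimmed):

  #include <PiDxe.h>
  #include <Library/UefiBootServicesTableLib.h>
  #include <Library/DxeServicesLib.h>
  #include <Library/DebugLib.h>

  EFI_STATUS
  PlayBootSound (
    IN CONST EFI_GUID  *WaveFileGuid
    )
  {
    EFI_STATUS  Status;
    VOID        *WaveFile;
    UINTN       WaveSize;
    EFI_EVENT   PlaybackDone;

    //
    // 3a) Read the Wave file out of any firmware volume by its GUID name.
    //     The buffer must be freed with FreePool() once playback is done.
    //
    Status = GetSectionFromAnyFv (WaveFileGuid, EFI_SECTION_RAW, 0, &WaveFile, &WaveSize);
    if (EFI_ERROR (Status)) {
      return Status;
    }
    DEBUG ((DEBUG_INFO, "Boot sound wave file: %u bytes\n", (UINT32)WaveSize));

    //
    // 3b) Parse the WAVE container here to find the PCM samples, sample rate,
    //     and channel count (the device will not decode the container for you).
    //
    // 3c) Audio->LoadBuffer (Audio, SampleRate, EFI_AUDIO_PROTOCOL_SAMPLE_FORMAT_S16,
    //                        ChannelMap, SampleCount, Samples);   // proposed API, not final
    //

    //
    // 3d) Event for the driver to signal when playback completes.
    //
    Status = gBS->CreateEvent (0, TPL_CALLBACK, NULL, NULL, &PlaybackDone);
    if (EFI_ERROR (Status)) {
      return Status;
    }

    //
    // 3e) Audio->StartPlayback (Audio, PlaybackDone);   // proposed API, not final
    //
    return EFI_SUCCESS;
  }
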
>> The reason that LoadBuffer() takes so many parameters is that the device
>> does not know anything about the audio that you're passing in. If I'm given
>> an array of 16-bit audio samples, it's impossible to know the parameters
>> (sample rate, sample format, channel mapping, etc.) from that alone.
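
For reference, a LoadBuffer() along those lines would carry roughly these parameters (a sketch only; none of the names or types here are settled):

  #include <Uefi.h>

  typedef struct _EFI_AUDIO_PROTOCOL EFI_AUDIO_PROTOCOL;

  typedef enum {
    EFI_AUDIO_PROTOCOL_SAMPLE_FORMAT_S16    // signed 16-bit PCM; other formats TBD
  } EFI_AUDIO_PROTOCOL_SAMPLE_FORMAT;

  typedef
  EFI_STATUS
  (EFIAPI *EFI_AUDIO_PROTOCOL_LOAD_BUFFER)(
    IN EFI_AUDIO_PROTOCOL                *This,
    IN UINT32                            SampleRate,    // e.g. 44100, 48000, ...
    IN EFI_AUDIO_PROTOCOL_SAMPLE_FORMAT  SampleFormat,
    IN UINT32                            ChannelMap,    // which speaker each channel feeds
    IN UINTN                             SampleCount,
    IN CONST VOID                        *Samples       // interleaved PCM data
    );
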
>> Using your idea, though, my protocol could be greatly simplified.
>> Forcing a particular channel mapping, sample rate, and sample format on
>> everyone would, however, complicate application code. From an application point
>> of view, one would, with that type of protocol, need to do the
>> following:
>> 1) Load an audio file in any audio file format from any storage mechanism.
>> 2) Decode the audio file format to extract the samples and audio metadata.
>> 3) Resample the (now decoded) audio samples and convert (quantize) the
>> audio samples into signed 16-bit PCM audio.
>> 4) Forward the samples to the EFI audio protocol.
>> There is another option. (I'm happy we're discussing this now -- we
>> can hammer out all the details now which will make a lot of things
>> easier.) Since I'll most likely end up splitting the device-specific
>> interfaces into different audio protocols, we could make a simple audio
>> protocol that makes various assumptions about the audio samples you're
>> giving it (e.g. sample rate, format, ...). This would just allow
>> audio output and input in signed 16-bit PCM audio, as you've
>> suggested, and would be a simple and easy-to-use interface. Something
>> like:
>> typedef struct EFI_SIMPLE_AUDIO_PROTOCOL {
>>   EFI_SIMPLE_AUDIO_PROTOCOL_RESET  Reset;
>>   EFI_SIMPLE_AUDIO_PROTOCOL_START  Start;
>>   EFI_SIMPLE_AUDIO_PROTOCOL_STOP   Stop;
>> } EFI_SIMPLE_AUDIO_PROTOCOL;
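
The member prototypes behind that struct might be as small as the following (a sketch only; in a real header these typedefs would sit above the struct, and the parameter lists are just a guess at the fixed 44100 Hz / mono / S16 contract):

  #include <Uefi.h>

  typedef struct EFI_SIMPLE_AUDIO_PROTOCOL EFI_SIMPLE_AUDIO_PROTOCOL;

  typedef
  EFI_STATUS
  (EFIAPI *EFI_SIMPLE_AUDIO_PROTOCOL_RESET)(
    IN EFI_SIMPLE_AUDIO_PROTOCOL  *This
    );

  typedef
  EFI_STATUS
  (EFIAPI *EFI_SIMPLE_AUDIO_PROTOCOL_START)(
    IN EFI_SIMPLE_AUDIO_PROTOCOL  *This,
    IN CONST INT16                *Samples,      // mono, signed 16-bit PCM at 44100 Hz
    IN UINTN                      SampleCount
    );

  typedef
  EFI_STATUS
  (EFIAPI *EFI_SIMPLE_AUDIO_PROTOCOL_STOP)(
    IN EFI_SIMPLE_AUDIO_PROTOCOL  *This
    );
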
>> This way, users and driver developers have a simple audio protocol
>> they can implement if they like. It would assume signed 16-bit PCM
>> audio and a mono channel mapping at 44100 Hz. Then, we can have an
>> advanced protocol for each device type (HDA, USB, VirtIO, ...) that
>> exposes all the knobs for sample formats, sample rates, that kind of
>> thing. Obviously, like the majority of (if not all) UEFI protocols, these
>> advanced protocols would be optional. I think, however, that the
>> simple audio protocol should be a required protocol in all UEFI
>> implementations. But that might not be possible. So would this simpler
>> interface work as a starting point?
>>
>> On 4/15/21, Andrew Fish <afish@apple.com> wrote:
>>>
>>>> On Apr 15, 2021, at 1:11 PM, Ethin Probst <harlydavidsen@gmail.com>
>>>> wrote:
>>>>
>>>>> Is there any necessity for audio input and output to be implemented
>>>>> within the same protocol? Unlike a network device (which is
>>>>> intrinsically bidirectional), it seems natural to conceptually separate
>>>>> audio input from audio output.
>>>> Nope, there isn't a necessity to make them in one, they can be
>>>> separated into two.
>>>>
>>>>> The code controlling volume/mute may not have any access to the sample
>>>>> buffer. The most natural implementation would seem to allow for a
>>>>> platform to notice volume up/down keypresses and use those to control the
>>>>> overall system volume, without any knowledge of which samples (if any)
>>>>> are currently being played by other code in the system.
>>>> You're assuming that the audio device you're implementing the
>>>> volume/muting for has volume control and muting functionality within
>>>> itself, then.
>>> Not really. We are assuming that audio hardware has a better understanding
>>> of how that system implements volume than some generic EFI code that is by
>>> definition platform-agnostic.
>>>
>>>> This may not be the case, and so we'd need to
>>>> effectively simulate it within the driver, which isn't too hard to do.
>>>> As an example, the VirtIO sound device does not have a request type for
>>>> muting or for volume control (this would, most likely, be within the
>>>> VIRTIO_SND_R_PCM_SET_PARAMS request, see sec. 5.14.6.4.3). Therefore,
>>>> either the driver would have to simulate the request or return
>>>> EFI_UNSUPPORTED in this instance.
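
Simulating it in the driver really is only a few lines over a signed 16-bit PCM buffer. A minimal sketch (the function and its name are illustrative, not part of any proposed interface):

  #include <Uefi.h>

  //
  // Software volume: scale every signed 16-bit PCM sample by a 0-100
  // percentage. Mute is simply the Percent == 0 case.
  //
  VOID
  ScaleVolume (
    IN OUT INT16   *Samples,
    IN     UINTN   SampleCount,
    IN     UINT32  Percent        // 0 = silence, 100 = unchanged
    )
  {
    UINTN  Index;

    for (Index = 0; Index < SampleCount; Index++) {
      Samples[Index] = (INT16)(((INT32)Samples[Index] * (INT32)Percent) / 100);
    }
  }
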
>>>>
>>> So this is an example of the above: the audio hardware knows it is routing
>>> the sound output into another subsystem, and that subsystem controls the
>>> volume. So the VirtIo sound driver knows best how to abstract volume/mute for
>>> this platform.
>>>
>>>>> Consider also the point of view of the developer implementing a driver
>>>>> for some other piece of audio hardware that happens not to support
>>>>> precisely the same sample rates etc as VirtIO. It would be extremely
>>>>> ugly to force all future hardware to pretend to have the same
>>>>> capabilities as VirtIO just because the API was initially designed with
>>>>> VirtIO in mind.
>>>> Precisely, but the brilliance of VirtIO
>>> The brilliance of VirtIO is that it just needs to implement a generic device
>>> driver for a given operating system. In most cases these operating systems
>>> have sound subsystems that manage sound and want fine-grained control
>>> over what is going on. So the drivers are implemented to maximize
>>> flexibility, since the OS has lots of generic code that deals with sound, and
>>> even user-configurable knobs to control audio. In our case that extra layer
>>> does not exist in EFI, and the end-user code just wants to tell the driver to
>>> do some simple things.
>>>
>>> Maybe it is easier to think about with an example. Let's say I want to play a
>>> cool sound on every boot. What would the workflow be to make that happen?
>>> 1) Encode the sound using a standard tool into a Wave PCM 16.
>>> 2) Place the Wave file in the Firmware Volume using a given UUID as the
>>> name. As simple as editing the platform FDF file.
>>> 3) Write some BDS code
>>> a) Look up the Wave file by UUID and read it into memory.
>>> b) Point the EFI Sound Protocol at the buffer with the Wave file.
>>> c) Tell the EFI Sound Protocol to play the sound (roughly sketched below).
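
In code, the whole of step 3 could stay about this small (a fragment only; declarations and cleanup are trimmed, gBootSoundFileGuid stands in for whatever GUID the FDF assigns, and Load()/Play() are notional member names, not a settled API):

  Status = GetSectionFromAnyFv (&gBootSoundFileGuid, EFI_SECTION_RAW, 0, &WaveFile, &WaveSize);
  if (!EFI_ERROR (Status)) {
    Status = Sound->Load (Sound, WaveFile, WaveSize);   // 3b) point the protocol at the Wave buffer
    if (!EFI_ERROR (Status)) {
      Status = Sound->Play (Sound);                     // 3c) play the sound
    }
  }
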
>>>
>>> If you start adding in a lot of parameters, that workflow starts getting
>>> really complicated really quickly.
>>>
>>>> is that the sample rate,
>>>> sample format, &c., do not have to all be supported by a VirtIO
>>>> device. Notice, also, how in my protocol proposal I noted that the
>>>> sample rates, at least, were "recommended," not "required." Should a
>>>> device not happen to support a sample rate or sample format, all it
>>>> needs to do is return EFI_INVALID_PARAMETER. Section 5.14.6.2.1
>>>> (VIRTIO_SND_R_JACK_GET_CONFIG) describes how a jack tells you what
>>>> sample rates it supports, channel mappings, &c.
>>>>
>>>> I do understand how just using a predefined sample rate and sample
>>>> format might be a good idea, and if that's the best way, then that's
>>>> what we'll do. The protocol can always be revised at a later time if
>>>> necessary. I apologize if my stance seems obstinate.
>>>>
>>> I think if we add the version into the protocol and make sure we have a
>>> separate load and play operation, we could add a member to set the extra
>>> parameters if needed. There might also be some platform-specific generic
>>> tunables that might be interesting for a future member function.
>>>
>>> Thanks,
>>>
>>> Andrew Fish
>>>
>>>> Also, thank you, Laszlo, for your advice -- I hadn't considered that a
>>>> network driver would be another good way of figuring out how async
>>>> works in UEFI.
>>>>
>>>> On 4/15/21, Andrew Fish <afish@apple.com> wrote:
>>>>>
>>>>>> On Apr 15, 2021, at 5:07 AM, Michael Brown <mcb30@ipxe.org> wrote:
>>>>>>
>>>>>> On 15/04/2021 06:28, Ethin Probst wrote:
>>>>>>> - I hoped to add recording in case we in future want to add
>>>>>>> accessibility aids like speech recognition (that was one of the todo
>>>>>>> tasks on the EDK2 tasks list)
>>>>>> Is there any necessity for audio input and output to be implemented
>>>>>> within the same protocol? Unlike a network device (which is intrinsically
>>>>>> bidirectional), it seems natural to conceptually separate audio input
>>>>>> from audio output.
>>>>>>
>>>>>>> - Muting and volume control could easily be added by just replacing
>>>>>>> the sample buffer with silence and by multiplying all the samples by a
>>>>>>> percentage.
>>>>>> The code controlling volume/mute may not have any access to the sample
>>>>>> buffer. The most natural implementation would seem to allow for a
>>>>>> platform to notice volume up/down keypresses and use those to control
>>>>>> the overall system volume, without any knowledge of which samples (if any)
>>>>>> are currently being played by other code in the system.
>>>>>>
>>>>> I’ve also thought of adding an NVRAM variable that would let setup, UEFI
>>>>> Shell, or even the OS set the current volume and mute. This "how it would
>>>>> be consumed" concept is why I proposed mute and volume being separate APIs.
>>>>> The volume up/down API in addition to a fixed percentage might be overkill,
>>>>> but it does allow a non-linear mapping to the volume up/down keys. You would
>>>>> be surprised how picky audiophiles can be, and it seems they like to file
>>>>> Bugzillas.
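
For the NVRAM idea, something along these lines would be enough (a sketch; the variable name, vendor GUID, and layout are placeholders, nothing like this is defined anywhere yet):

  #include <Uefi.h>
  #include <Library/UefiRuntimeServicesTableLib.h>

  #define AUDIO_VOLUME_VARIABLE_NAME  L"AudioVolume"

  typedef struct {
    UINT8    Volume;   // 0-100 percent
    BOOLEAN  Mute;
  } AUDIO_VOLUME_VARIABLE;

  EFI_STATUS
  SaveVolumeSetting (
    IN EFI_GUID  *VendorGuid,   // platform-chosen GUID for the variable
    IN UINT8     Volume,
    IN BOOLEAN   Mute
    )
  {
    AUDIO_VOLUME_VARIABLE  Setting;

    Setting.Volume = Volume;
    Setting.Mute   = Mute;

    //
    // Runtime access so setup, the Shell, or the OS can adjust it; the audio
    // driver reads it back (falling back to a default) before playback.
    //
    return gRT->SetVariable (
                  AUDIO_VOLUME_VARIABLE_NAME,
                  VendorGuid,
                  EFI_VARIABLE_NON_VOLATILE |
                    EFI_VARIABLE_BOOTSERVICE_ACCESS |
                    EFI_VARIABLE_RUNTIME_ACCESS,
                  sizeof (Setting),
                  &Setting
                  );
  }
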
>>>>>
>>>>>>> - Finally, the reason I used enumerations for specifying parameters
>>>>>>> like sample rate and stuff was that I was looking at this protocol
>>>>>>> from a general UEFI applications point of view. VirtIO supports all of
>>>>>>> the sample configurations listed in my gist, and it seems reasonable
>>>>>>> to allow the application to control those parameters instead of
>>>>>>> forcing a particular parameter configuration onto the developer.
>>>>>> Consider also the point of view of the developer implementing a driver
>>>>>> for some other piece of audio hardware that happens not to support
>>>>>> precisely the same sample rates etc as VirtIO. It would be extremely
>>>>>> ugly to force all future hardware to pretend to have the same capabilities
>>>>>> as VirtIO just because the API was initially designed with VirtIO in mind.
>>>>>>
>>>>>> As a developer on the other side of the API, writing code to play sound
>>>>>> files on an arbitrary unknown platform, I would prefer to simply consume
>>>>>> as simple as possible an abstraction of an audio output protocol and
>>>>>> not have to care about what hardware is actually implementing it.
>>>>>>
>>>>> It may make sense to have an API to load the buffer/stream and other APIs
>>>>> to play or pause. This could allow an optional API to configure how the
>>>>> stream is played back. If we add a version to the protocol, that would at
>>>>> least future-proof us.
>>>>>
>>>>> We did get feedback that it is very common to speed up the audio playback
>>>>> rate for accessibility. I'm not sure if that is practical with a simple
>>>>> PCM 16 wave file with the firmware audio implementation. I guess that is
>>>>> something we could investigate.
>>>>>
>>>>> In terms of maybe adding text-to-speech, there is an open-source project
>>>>> that conceptually we could port to EFI. It would likely be a binary that
>>>>> would have to live on the EFI System Partition due to size. I was thinking
>>>>> that words per minute could be part of that API and it would produce a
>>>>> PCM 16 wave file that the audio protocol we are discussing could play.
>>>>>
>>>>>> Both of these argue in favour of defining a very simple API that expresses
>>>>>> only a common baseline capability that is plausibly implementable for
>>>>>> every piece of audio hardware ever made.
>>>>>>
>>>>>> Coupled with the relatively minimalistic requirements for boot-time audio,
>>>>>> I'd probably suggest supporting only a single format for audio data, with
>>>>>> a fixed sample rate (and possibly only mono output).
>>>>>>
>>>>> In my world the folks that work for Jony asked for a stereo boot bong to
>>>>> transition from left to right :). “This is not the CODEC you are looking
>>>>> for” was our response…. I also did not mention that some languages are
>>>>> right-to-left, as the only thing worse than one complex thing is two
>>>>> complex things to implement.
>>>>>
>>>>>> As always: perfection is achieved, not when there is nothing more to add,
>>>>>> but when there is nothing left to take away. :)
>>>>>>
>>>>> "Simplicity is the ultimate sophistication”
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Andrew Fish
>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Michael
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>> --
>>>> Signed,
>>>> Ethin D. Probst
>>>
>>
>
>
>
>