From: "Marvin Häuser" <mhaeuser@posteo.de>
To: devel@edk2.groups.io, afish@apple.com
Cc: Ethin Probst <harlydavidsen@gmail.com>,
Michael Brown <mcb30@ipxe.org>,
Mike Kinney <michael.d.kinney@intel.com>,
Leif Lindholm <leif@nuviainc.com>,
Laszlo Ersek <lersek@redhat.com>,
"Desimone, Nathaniel L" <nathaniel.l.desimone@intel.com>,
Rafael Rodrigues Machado <rafaelrodrigues.machado@gmail.com>,
Gerd Hoffmann <kraxel@redhat.com>
Subject: Re: [edk2-devel] VirtIO Sound Driver (GSoC 2021)
Date: Fri, 16 Apr 2021 15:03:39 +0000 [thread overview]
Message-ID: <22f0c186-b762-7790-92f6-e8347b73e26b@posteo.de> (raw)
In-Reply-To: <2296BE7E-ACEE-4286-9A5C-408B2D1ADC2E@apple.com>
On 16.04.21 16:34, Andrew Fish via groups.io wrote:
>
>
>> On Apr 16, 2021, at 6:22 AM, Marvin Häuser <mhaeuser@posteo.de
>> <mailto:mhaeuser@posteo.de>> wrote:
>>
>> Good day,
>>
>> Sorry for the nitpicking.
>>
>> - Protocols always need a "Revision" field as first member. This is
>> used to be able to expand its capabilities in later revisions without
>> introducing a new, distinct protocol.
>> - Consider the name EFI_SIMPLE_AUDIO_OUTPUT(!)_PROTOCOL, to not cause
>> confusion if input is ever added. Input in my opinion should be a
>> separate protocol as there is no reason why they would necessarily be
>> coupled topology-wise (think of an USB microphone, it will never have
>> any sort of output).
>> - To make code safety a bit easier, try to use "CONST" for "IN"
>> (non-OUT) pointers, so that CONST can be propagated where possible.
>> - Please do *not* make the events caller-owned. We had it multiple
>> times already on production firmware that events are left dangling
>> and may be polled/signaled after ExitBS(). The caller should be able
>> to decide on some policy maybe (i.e. abort or block on ExitBS() until
>> the playback finished), as cut-off audio may be awkward; but the
>> callee definitely should implement "event safety" itself. Maybe avoid
>> exposing events directly at all and provide nice abstractions the
>> caller cannot misuse.
>> - I don't think audio should be required at all, the required subset
>> should firstly consider minimalism and security. Accessibility will
>> not be of concern for some IoT device, the audio code would simply
>> eat space, and introduce a larger surface for bugs.
>>
>
> Marvin,
>
> Generally how we work this in the UEFI Specification is we make it
> optional via the following wording: “If a platform includes the
> ability to play audio in EFI then the EFI_SIMPLE_AUDIO_OUTPUT_PROTOCOL
> must be implemented.
Yes, this sounds good. Just saying it should not be strictly mandatory
on all platforms. :) Thanks!
Best regards,
Marvin
>
> Basically this requirement will get added to UEFI Specification 2.6.2
> Platform-Specific Elements.
>
> Thanks,
>
> Andrew Fish
>
>> Best regards,
>> Marvin
>>
>> On 16.04.21 01:42, Ethin Probst wrote:
>>> Hi Andrew,
>>>
>>> What would that protocol interface look like if we utilized your idea?
>>> With mine (though I need to add channel mapping as well), your
>>> workflow for playing a stereo sound from left to right would probably
>>> be something like this:
>>> 1) Encode the sound using a standard tool into a Wave PCM 16.
>>> 2) Place the Wave file in the Firmware Volume using a given UUID as
>>> the name. As simple as editing the platform FDF file.
>>> 3) Write some BDS code
>>> a) Lookup Wave file by UUID and read it into memory.
>>> b) Decode the audio file (audio devices will not do this decoding
>>> for you, you have to do that yourself).
>>> c) Call EFI_AUDIO_PROTOCOL.LoadBuffer(), passing in the sample rate
>>> of your audio, EFI_AUDIO_PROTOCOL_SAMPLE_FORMAT_S16 for signed 16-bit
>>> PCM audio, the channel mapping, the number of samples, and the samples
>>> themselves.
>>> d) call
>>> EFI_BOOT_SERVICES.CreateEvent()/EFI_BOOT_SERVICES.CreateEventEx()
>>> for a playback event to signal.
>>> e) call EFI_AUDIO_PROTOCOL.StartPlayback(), passing in the event you
>>> just created.
>>> The reason that LoadBuffer() takes so many parameters is because the
>>> device does not know the audio that your passing in. If I'm given an
>>> array of 16-bit audio samples, its impossible to know the parameters
>>> (sample rate, sample format, channel mapping, etc.) from that alone.
>>> Using your idea, though, my protocol could be greatly simplified.
>>> Forcing a particular channel mapping, sample rate and sample format on
>>> everyone would complicate application code. From an application point
>>> of view, one would, with that type of protocol, need to do the
>>> following:
>>> 1) Load an audio file in any audio file format from any storage
>>> mechanism.
>>> 2) Decode the audio file format to extract the samples and audio
>>> metadata.
>>> 3) Resample the (now decoded) audio samples and convert (quantize) the
>>> audio samples into signed 16-bit PCM audio.
>>> 4) forward the samples onto the EFI audio protocol.
>>> There is another option. (I'm happy we're discussing this now -- we
>>> can hammer out all the details now which will make a lot of things
>>> easier.) Since I'll most likely end up splitting the device-specific
>>> interfaces to different audio protocols, we could make a simple audio
>>> protocol that makes various assumptions about the audio samples your
>>> giving it (e.g.: sample rate, format, ...). This would just allow
>>> audio output and input in signed 16-bit PCM audio, as you've
>>> suggested, and would be a simple and easy to use interface. Something
>>> like:
>>> typedef struct EFI_SIMPLE_AUDIO_PROTOCOL {
>>> EFI_SIMPLE_AUDIO_PROTOCOL_RESET Reset;
>>> EFI_SIMPLE_AUDIO_PROTOCOL_START Start;
>>> EFI_SIMPLE_AUDIO_PROTOCOL_STOP Stop;
>>> } EFI_SIMPLE_AUDIO_PROTOCOL;
>>> This way, users and driver developers have a simple audio protocol
>>> they can implement if they like. It would assume signed 16-bit PCM
>>> audio and mono channel mappings at 44100 Hz. Then, we can have an
>>> advanced protocol for each device type (HDA, USB, VirtIO, ...) that
>>> exposes all the knobs for sample formats, sample rates, that kind of
>>> thing. Obviously, like the majority (if not all) UEFI protocols, these
>>> advanced protocols would be optional. I think, however, that the
>>> simple audio protocol should be a required protocol in all UEFI
>>> implementations. But that might not be possible. So would this simpler
>>> interface work as a starting point?
>>>
>>> On 4/15/21, Andrew Fish <afish@apple.com <mailto:afish@apple.com>>
>>> wrote:
>>>>
>>>>> On Apr 15, 2021, at 1:11 PM, Ethin Probst <harlydavidsen@gmail.com
>>>>> <mailto:harlydavidsen@gmail.com>>
>>>>> wrote:
>>>>>
>>>>>> Is there any necessity for audio input and output to be implemented
>>>>>> within the same protocol? Unlike a network device (which is
>>>>>> intrinsically bidirectional), it seems natural to conceptually
>>>>>> separate
>>>>>> audio input from audio output.
>>>>> Nope, there isn't a necessity to make them in one, they can be
>>>>> separated into two.
>>>>>
>>>>>> The code controlling volume/mute may not have any access to the
>>>>>> sample
>>>>>> buffer. The most natural implementation would seem to allow for a
>>>>>> platform to notice volume up/down keypresses and use those to
>>>>>> control the
>>>>>> overall system volume, without any knowledge of which samples (if
>>>>>> any)
>>>>>> are currently being played by other code in the system.
>>>>> Your assuming that the audio device your implementing the
>>>>> volume/muting has volume control and muting functionality within
>>>>> itself, then.
>>>> Not really. We are assuming that audio hardware has a better
>>>> understanding
>>>> of how that system implements volume than some generic EFI Code
>>>> that is by
>>>> definition platform agnostic.
>>>>
>>>>> This may not be the case, and so we'd need to
>>>>> effectively simulate it within the driver, which isn't too hard to do.
>>>>> As an example, the VirtIO driver does not have a request type for
>>>>> muting or for volume control (this would, most likely, be within the
>>>>> VIRTIO_SND_R_PCM_SET_PARAMS request, see sec. 5.14.6.4.3). Therefore,
>>>>> either the driver would have to simulate the request or return
>>>>> EFI_UNSUPPORTED this instance.
>>>>>
>>>> So this is an example of above since the audio hardware knows it is
>>>> routing
>>>> the sound output into another subsystem, and that subsystem
>>>> controls the
>>>> volume. So the VirtIo Sound Driver know best how to bstract
>>>> volume/mute for
>>>> this platform.
>>>>
>>>>>> Consider also the point of view of the developer implementing a
>>>>>> driver
>>>>>> for some other piece of audio hardware that happens not to support
>>>>>> precisely the same sample rates etc as VirtIO. It would be extremely
>>>>>> ugly to force all future hardware to pretend to have the same
>>>>>> capabilities as VirtIO just because the API was initially
>>>>>> designed with
>>>>>> VirtIO in mind.
>>>>> Precisely, but the brilliance of VirtIO
>>>> The brilliance of VirtIO is that it just needs to implement a
>>>> generic device
>>>> driver for a given operating system. In most cases these operating
>>>> systems
>>>> have sounds subsystems that manage sound and want fine granularity of
>>>> control on what is going on. So the drivers are implemented to
>>>> maximizes
>>>> flexibility since the OS has lots of generic code that deals with
>>>> sound, and
>>>> even user configurable knobs to control audio. In our case that
>>>> extra layer
>>>> does not exist in EFI and the end user code just want to tell the
>>>> driver do
>>>> some simple things.
>>>>
>>>> Maybe it is easier to think about with an example. Lets say I want
>>>> to play a
>>>> cool sound on every boot. What would be the workflow to make the
>>>> happen.
>>>> 1) Encode the sound using a standard tool into a Wave PCM 16.
>>>> 2) Place the Wave file in the Firmware Volume using a given UUID as the
>>>> name. As simple as editing the platform FDF file.
>>>> 3) Write some BDS code
>>>> a) Lookup Wave file by UUID and read it into memory.
>>>> b) Point the EFI Sound Protocol at the buffer with the Wave file
>>>> c) Tell the EFI Sound Protocol to play the sound.
>>>>
>>>> If you start adding in a lot of perimeters that work flow starts
>>>> getting
>>>> really complicated really quickly.
>>>>
>>>>> is that the sample rate,
>>>>> sample format, &c., do not have to all be supported by a VirtIO
>>>>> device. Notice, also, how in my protocol proposal I noted that the
>>>>> sample rates, at least, were "recommended," not "required." Should a
>>>>> device not happen to support a sample rate or sample format, all it
>>>>> needs to do is return EFI_INVALID_PARAMETER. Section 5.14.6.2.1
>>>>> (VIRTIO_SND_R_JACK_GET_CONFIG) describes how a jack tells you what
>>>>> sample rates it supports, channel mappings, &c.
>>>>>
>>>>> I do understand how just using a predefined sample rate and sample
>>>>> format might be a good idea, and if that's the best way, then that's
>>>>> what we'll do. The protocol can always be revised at a later time if
>>>>> necessary. I apologize if my stance seems obstinate.
>>>>>
>>>> I think if we add the version into the protocol and make sure we have a
>>>> separate load and play operation we could add a member to set the extra
>>>> perimeters if needed. There might also be some platform specific
>>>> generic
>>>> tunables that might be interesting for a future member function.
>>>>
>>>> Thanks,
>>>>
>>>> Andrew Fish
>>>>
>>>>> Also, thank you, Laszlo, for your advice -- I hadn't considered that a
>>>>> network driver would be another good way of figuring out how async
>>>>> works in UEFI.
>>>>>
>>>>> On 4/15/21, Andrew Fish <afish@apple.com <mailto:afish@apple.com>>
>>>>> wrote:
>>>>>>
>>>>>>> On Apr 15, 2021, at 5:07 AM, Michael Brown <mcb30@ipxe.org
>>>>>>> <mailto:mcb30@ipxe.org>> wrote:
>>>>>>>
>>>>>>> On 15/04/2021 06:28, Ethin Probst wrote:
>>>>>>>> - I hoped to add recording in case we in future want to add
>>>>>>>> accessibility aids like speech recognition (that was one of the
>>>>>>>> todo
>>>>>>>> tasks on the EDK2 tasks list)
>>>>>>> Is there any necessity for audio input and output to be implemented
>>>>>>> within
>>>>>>> the same protocol? Unlike a network device (which is intrinsically
>>>>>>> bidirectional), it seems natural to conceptually separate audio
>>>>>>> input
>>>>>>> from
>>>>>>> audio output.
>>>>>>>
>>>>>>>> - Muting and volume control could easily be added by just replacing
>>>>>>>> the sample buffer with silence and by multiplying all the
>>>>>>>> samples by a
>>>>>>>> percentage.
>>>>>>> The code controlling volume/mute may not have any access to the
>>>>>>> sample
>>>>>>> buffer. The most natural implementation would seem to allow for a
>>>>>>> platform to notice volume up/down keypresses and use those to
>>>>>>> control
>>>>>>> the
>>>>>>> overall system volume, without any knowledge of which samples
>>>>>>> (if any)
>>>>>>> are
>>>>>>> currently being played by other code in the system.
>>>>>>>
>>>>>> I’ve also thought of adding NVRAM variable that would let setup, UEFI
>>>>>> Shell,
>>>>>> or even the OS set the current volume, and Mute. This how it would be
>>>>>> consumed concept is why I proposed mute and volume being separate
>>>>>> APIs.
>>>>>> The
>>>>>> volume up/down API in addition to fixed percentage might be
>>>>>> overkill, but
>>>>>> it
>>>>>> does allow a non liner mapping to the volume up/down keys. You
>>>>>> would be
>>>>>> surprised how picky audiophiles can be and it seems they like to file
>>>>>> Bugzillas.
>>>>>>
>>>>>>>> - Finally, the reason I used enumerations for specifying parameters
>>>>>>>> like sample rate and stuff was that I was looking at this protocol
>>>>>>>> from a general UEFI applications point of view. VirtIO supports
>>>>>>>> all of
>>>>>>>> the sample configurations listed in my gist, and it seems
>>>>>>>> reasonable
>>>>>>>> to allow the application to control those parameters instead of
>>>>>>>> forcing a particular parameter configuration onto the developer.
>>>>>>> Consider also the point of view of the developer implementing a
>>>>>>> driver
>>>>>>> for
>>>>>>> some other piece of audio hardware that happens not to support
>>>>>>> precisely
>>>>>>> the same sample rates etc as VirtIO. It would be extremely ugly to
>>>>>>> force
>>>>>>> all future hardware to pretend to have the same capabilities as
>>>>>>> VirtIO
>>>>>>> just because the API was initially designed with VirtIO in mind.
>>>>>>>
>>>>>>> As a developer on the other side of the API, writing code to
>>>>>>> play sound
>>>>>>> files on an arbitrary unknown platform, I would prefer to simply
>>>>>>> consume
>>>>>>> as simple as possible an abstraction of an audio output protocol and
>>>>>>> not
>>>>>>> have to care about what hardware is actually implementing it.
>>>>>>>
>>>>>> It may make sense to have an API to load the buffer/stream and
>>>>>> other APIs
>>>>>> to
>>>>>> play or pause. This could allow an optional API to configure how the
>>>>>> stream
>>>>>> is played back. If we add a version to the Protocol that would at
>>>>>> least
>>>>>> future proof us.
>>>>>>
>>>>>> We did get feedback that it is very common to speed up the auto
>>>>>> playback
>>>>>> rates for accessibility. I’m not sure if that is practical with a
>>>>>> simple
>>>>>> PCM
>>>>>> 16 wave file with the firmware audio implementation. I guess that is
>>>>>> something we could investigate.
>>>>>>
>>>>>> In terms of maybe adding text to speech there is an open source
>>>>>> project
>>>>>> that
>>>>>> conceptually we could port to EFI. It would likely be a binary that
>>>>>> would
>>>>>> have to live on the EFI System Partition due to size. I was thinking
>>>>>> that
>>>>>> words per minute could be part of that API and it would produce a
>>>>>> PCM 16
>>>>>> wave file that the audio protocol we are discussing could play.
>>>>>>
>>>>>>> Both of these argue in favour of defining a very simple API that
>>>>>>> expresses
>>>>>>> only a common baseline capability that is plausibly
>>>>>>> implementable for
>>>>>>> every piece of audio hardware ever made.
>>>>>>>
>>>>>>> Coupled with the relatively minimalistic requirements for boot-time
>>>>>>> audio,
>>>>>>> I'd probably suggest supporting only a single format for audio data,
>>>>>>> with
>>>>>>> a fixed sample rate (and possibly only mono output).
>>>>>>>
>>>>>> In my world the folks that work for Jony asked for a stereo boot
>>>>>> bong to
>>>>>> transition from left to right :). This is not the CODEC you are
>>>>>> looking
>>>>>> for
>>>>>> was our response…. I also did not mention that some languages are
>>>>>> right
>>>>>> to
>>>>>> left, as the only thing worse than one complex thing is two complex
>>>>>> things
>>>>>> to implement.
>>>>>>
>>>>>>> As always: perfection is achieved, not when there is nothing more to
>>>>>>> add,
>>>>>>> but when there is nothing left to take away. :)
>>>>>>>
>>>>>> "Simplicity is the ultimate sophistication”
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Andrew Fish
>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Michael
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Signed,
>>>>> Ethin D. Probst
>>>>
>>>
>>
>>
>>
>
>
next prev parent reply other threads:[~2021-04-16 15:03 UTC|newest]
Thread overview: 62+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-03-29 20:28 VirtIO Sound Driver (GSoC 2021) Ethin Probst
2021-03-30 3:17 ` [edk2-devel] " Nate DeSimone
2021-03-30 14:41 ` Ethin Probst
2021-03-31 6:34 ` Nate DeSimone
2021-03-30 21:50 ` Rafael Machado
2021-03-30 22:12 ` Ethin Probst
[not found] ` <16713E6D64EE917D.25648@groups.io>
2021-03-31 0:01 ` Ethin Probst
2021-03-31 5:02 ` Andrew Fish
2021-03-31 6:41 ` Nate DeSimone
2021-04-01 5:05 ` Ethin Probst
2021-04-05 4:42 ` Andrew Fish
2021-04-05 4:43 ` Ethin Probst
2021-04-05 5:10 ` Andrew Fish
2021-04-06 14:16 ` Laszlo Ersek
2021-04-06 15:17 ` Ethin Probst
2021-04-13 12:28 ` Leif Lindholm
2021-04-13 14:16 ` Andrew Fish
2021-04-13 16:53 ` Ethin Probst
2021-04-13 21:38 ` Andrew Fish
2021-04-13 23:47 ` Ethin Probst
[not found] ` <16758FB6436B1195.32393@groups.io>
2021-04-14 1:20 ` Ethin Probst
2021-04-14 3:52 ` Andrew Fish
2021-04-14 20:47 ` Ethin Probst
2021-04-14 22:30 ` Michael D Kinney
2021-04-14 23:54 ` Andrew Fish
2021-04-15 4:01 ` Ethin Probst
2021-04-15 4:58 ` Andrew Fish
2021-04-15 5:28 ` Ethin Probst
2021-04-15 12:07 ` Michael Brown
2021-04-15 16:42 ` Andrew Fish
2021-04-15 20:11 ` Ethin Probst
2021-04-15 22:35 ` Andrew Fish
2021-04-15 23:42 ` Ethin Probst
2021-04-16 0:59 ` Michael Brown
2021-04-16 5:13 ` Andrew Fish
2021-04-16 5:33 ` Ethin Probst
2021-04-16 11:34 ` Leif Lindholm
2021-04-16 17:03 ` Andrew Fish
2021-04-16 17:45 ` Ethin Probst
2021-04-16 17:55 ` Ethin Probst
2021-04-16 18:09 ` Andrew Fish
2021-04-16 23:50 ` Ethin Probst
2021-04-16 18:02 ` Andrew Fish
2021-04-17 16:51 ` Marvin Häuser
2021-04-17 17:31 ` Andrew Fish
2021-04-17 18:04 ` Marvin Häuser
2021-04-18 8:55 ` Ethin Probst
2021-04-18 15:22 ` Andrew Fish
2021-04-18 19:11 ` Marvin Häuser
2021-04-18 19:22 ` Marvin Häuser
2021-04-18 21:00 ` Andrew Fish
2021-04-16 13:22 ` Marvin Häuser
2021-04-16 14:34 ` Andrew Fish
2021-04-16 15:03 ` Marvin Häuser [this message]
[not found] ` <16762C957671127A.12361@groups.io>
2021-04-16 0:59 ` Ethin Probst
2021-04-16 1:03 ` Michael Brown
2021-04-16 2:06 ` Ethin Probst
2021-04-16 3:48 ` Andrew Fish
2021-04-16 4:29 ` Ethin Probst
2021-04-15 9:51 ` Laszlo Ersek
2021-04-15 16:01 ` Andrew Fish
2021-04-04 22:05 ` Marvin Häuser
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-list from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=22f0c186-b762-7790-92f6-e8347b73e26b@posteo.de \
--to=devel@edk2.groups.io \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox