From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from rn-mailsvcp-ppex-lapp44.apple.com (rn-mailsvcp-ppex-lapp44.apple.com [17.179.253.48])
 by mx.groups.io with SMTP id smtpd.web10.16746.1618779638659227367
 for <devel@edk2.groups.io>;
 Sun, 18 Apr 2021 14:00:38 -0700
Authentication-Results: mx.groups.io;
 dkim=pass header.i=@apple.com header.s=20180706 header.b=lxztPsnp;
 spf=pass (domain: apple.com, ip: 17.179.253.48, mailfrom: afish@apple.com)
Received: from pps.filterd (rn-mailsvcp-ppex-lapp44.rno.apple.com [127.0.0.1])
	by rn-mailsvcp-ppex-lapp44.rno.apple.com (8.16.1.2/8.16.1.2) with SMTP id 13IKvV8l006854;
	Sun, 18 Apr 2021 14:00:38 -0700
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=apple.com; h=from : message-id :
 content-type : mime-version : subject : date : in-reply-to : cc : to :
 references; s=20180706; bh=qxtwNw7QWhTZydfWKvXhvArsQacKWCCsNyZbCweII2E=;
 b=lxztPsnp99vka7PTxN+bfyeWIXnGd/d/H4dQP3zi+Q25vW761F5zBKjr8caaqEkiP/YX
 bd/K10MP4uULyrAiycFZ+q449KMDpJXE2QFLwdD6GrioUS6e6s7S2tXcbQHLIJ1LuWoD
 ZzkA1rV5s0D8zuY1Jbz7c75qvfCICjImecwdYFp+2G9fssfxrZz5D/lHsOcUZPIcfTuI
 xEPj8hH1z+zuOQKKaQcXNNz1/nsdNhRTSjLq+kz0K513Ohn8JRcoE6MutB01lVTc58Wf
 UsyAZ70ht19w6IxYdlQTiHXeD+z1IPUfCZnNBobzOotgn8wJkKk3IOFH2KQwcsEq4rkA hQ== 
Received: from rn-mailsvcp-mta-lapp04.rno.apple.com (rn-mailsvcp-mta-lapp04.rno.apple.com [10.225.203.152])
	by rn-mailsvcp-ppex-lapp44.rno.apple.com with ESMTP id 37yu67dqff-1
	(version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NO);
	Sun, 18 Apr 2021 14:00:38 -0700
Received: from rn-mailsvcp-mmp-lapp03.rno.apple.com
 (rn-mailsvcp-mmp-lapp03.rno.apple.com [17.179.253.16])
 by rn-mailsvcp-mta-lapp04.rno.apple.com
 (Oracle Communications Messaging Server 8.1.0.7.20201203 64bit (built Dec  3
 2020)) with ESMTPS id <0QRS00QMD2D2CP00@rn-mailsvcp-mta-lapp04.rno.apple.com>;
 Sun, 18 Apr 2021 14:00:38 -0700 (PDT)
Received: from process_milters-daemon.rn-mailsvcp-mmp-lapp03.rno.apple.com by
 rn-mailsvcp-mmp-lapp03.rno.apple.com
 (Oracle Communications Messaging Server 8.1.0.7.20201203 64bit (built Dec  3
 2020)) id <0QRS00W002COST00@rn-mailsvcp-mmp-lapp03.rno.apple.com>; Sun,
 18 Apr 2021 14:00:38 -0700 (PDT)
X-Va-A: 
X-Va-T-CD: cc354bcf01ea39de908abab4e73c9ec0
X-Va-E-CD: 4730c80ee67030d4f2c83e40b4ab0357
X-Va-R-CD: 6f0325faf294bd23a6d751c620be9d51
X-Va-CD: 0
X-Va-ID: c5476d82-bbe4-4b24-9c43-b036327b08b8
X-V-A: 
X-V-T-CD: cc354bcf01ea39de908abab4e73c9ec0
X-V-E-CD: 4730c80ee67030d4f2c83e40b4ab0357
X-V-R-CD: 6f0325faf294bd23a6d751c620be9d51
X-V-CD: 0
X-V-ID: 6ac2534c-b844-4768-8fcf-3b25d31d5b1b
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.391,18.0.761
 definitions=2021-04-18_12:2021-04-16,2021-04-18 signatures=0
Received: from [17.235.52.5] (unknown [17.235.52.5])
 by rn-mailsvcp-mmp-lapp03.rno.apple.com
 (Oracle Communications Messaging Server 8.1.0.7.20201203 64bit (built Dec  3
 2020))
 with ESMTPSA id <0QRS00U132CZ7F00@rn-mailsvcp-mmp-lapp03.rno.apple.com>; Sun,
 18 Apr 2021 14:00:37 -0700 (PDT)
From: "Andrew Fish" <afish@apple.com>
Message-id: <537C3A1C-044A-49AD-86A1-F374DCA294E3@apple.com>
MIME-version: 1.0 (Mac OS X Mail 14.0 \(3654.20.0.2.1\))
Subject: Re: [edk2-devel] VirtIO Sound Driver (GSoC 2021)
Date: Sun, 18 Apr 2021 14:00:35 -0700
In-reply-to: <7f129be1-d8b9-bc4f-958f-21e5ac6cc3d9@posteo.de>
Cc: Ethin Probst <harlydavidsen@gmail.com>, Leif Lindholm <leif@nuviainc.com>,
        Michael Brown <mcb30@ipxe.org>,
        Mike Kinney <michael.d.kinney@intel.com>,
        Laszlo Ersek <lersek@redhat.com>,
        "Desimone, Nathaniel L" <nathaniel.l.desimone@intel.com>,
        Rafael Rodrigues Machado <rafaelrodrigues.machado@gmail.com>,
        Gerd Hoffmann <kraxel@redhat.com>
To: devel@edk2.groups.io, mhaeuser@posteo.de
References: <4AEC1784-99AF-47EF-B7DD-77F91EA3D7E9@apple.com>
 <309cc5ca-2ecd-79dd-b183-eec0572ea982@ipxe.org>
 <A139650C-A76F-4471-AFCC-FFF1BE2E35BB@apple.com>
 <CAJQtwF3kuOD3C2arUfZu_xDbkHq5HHz+LYNB2=AeV8x+q_cPtw@mail.gmail.com>
 <CCB65CBC-304C-42B3-810D-0AC8BEAE29D1@apple.com>
 <CAJQtwF08Apihdsitw2Vs-+iV9rrqgpqkwONSbaY-yb_BP1xCYw@mail.gmail.com>
 <33e37977-2d27-36a0-89a6-36e513d06b2f@ipxe.org>
 <6F69BEA6-5B7A-42E5-B6DA-D819ECC85EE5@apple.com>
 <CAJQtwF1zbeO=7bMq7KLMQLtwGKhA0MNL-qnRtobtp+TaCjG_2A@mail.gmail.com>
 <20210416113447.GG1664@vanye> <10E3436C-D743-4B2F-8E4B-7AD93B82FC92@apple.com>
 <CAJQtwF01S3XhywFzXg+ATR5Xf2qR4HNLN579D462BkOg+w5fCw@mail.gmail.com>
 <ca86fba8-ed47-e8df-b248-aa683e4f39e8@posteo.de>
 <7459B8C0-EDF0-4760-97E7-D3338312B3DF@apple.com>
 <9b5f25d9-065b-257d-1d2d-7f80d14dec64@posteo.de>
 <CAJQtwF1x9n6rGtJNe-d1QYPQzjKn=nhphhqqNPzoEZ_FVK4+eg@mail.gmail.com>
 <F09416EC-D2A3-4DE4-9809-7A34C68A05C1@apple.com>
 <6c0a4bf5-482e-b4f2-5df4-74930f4d979c@posteo.de>
 <7f129be1-d8b9-bc4f-958f-21e5ac6cc3d9@posteo.de>
X-Mailer: Apple Mail (2.3654.20.0.2.1)
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.391,18.0.761
 definitions=2021-04-18_14:2021-04-16,2021-04-18 signatures=0
Content-type: multipart/alternative;
 boundary="Apple-Mail=_7356BD0C-473A-494E-9EE4-A226AB7F9F79"

--Apple-Mail=_7356BD0C-473A-494E-9EE4-A226AB7F9F79
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
	charset=utf-8


> On Apr 18, 2021, at 12:22 PM, Marvin H=C3=A4user <mhaeuser@posteo.de> wr=
ote:
>=20
> On 18.04.21 21:11, Marvin H=C3=A4user wrote:
>> On 18.04.21 17:22, Andrew Fish via groups.io wrote:
>>>=20
>>>=20
>>>> On Apr 18, 2021, at 1:55 AM, Ethin Probst <harlydavidsen@gmail.com <m=
ailto:harlydavidsen@gmail.com>> wrote:
>>>>=20
>>>>> I think it would be best to sketch use-cases for audio and design th=
e solutions closely to the requirements. Why do we need to know when audio =
finished? What will happen when we queue audio twice? There are many layers=
 (UX, interface, implementation details) of questions in coming up with a p=
leasant and stable design.
>>>=20
>>> We are not using EFI to listen to music in the background. Any audio b=
eing played is part of a UI element and there might be synchronization requ=
irements.
>>=20
>> Maybe I communicated that wrong, I'm not asking because I don't know wh=
at audio is used for, I am saying ideally there is a written-down list of u=
sage requirements before the protocol is designed, because that is what the=
 design targets. The details should follow the needs.
>>=20
>>>=20
>>> For example playing a boot bong on boot you want that to be asynchrono=
us as you don=E2=80=99t want to delay boot to play the sound, but you may w=
ant to choose to gate some UI elements on the boot bong completing. If you a=
re building a menu UI that is accessible you may need to synchronize playba=
ck with UI update, but you may not want to make the slow sound playback blo=
cking as you can get other UI work done in parallel.
>>>=20
>>> The overhead for a caller making an async call is not much [1], but no=
t having the capability could really restrict the API for its intended use.=
 I=E2=80=99d also point out we picked the same pattern as the async BlockIO=
 and there is something to be said for having consistency in the UEFI Spec and=
 having similar APIs work in similar ways.
>=20
> Sorry for all the spam, but I somehow missed the "consistency" point. S=
orry, but there seems to be no real consistency. Block I/O and network thin=
gs generally use the token event method (what is suggested here), while USB=
, Bluetooth, and such generally pass a callback function directly (not nece=
ssarily what I suggest, as I don't know the full requirements, but certainl=
y one way).
>=20

The networking interfaces are ancient, and it looks like the recent BT
stack leans into that model. Also, those callbacks are about returning
data in the form of unknown packets, which is not the problem we are
trying to solve.

The async Block IO is more of a model of notification that a queued
request has completed, and I think that maps better onto what we are
doing. The MP Services protocol from the PI spec also uses an optional
event to notify completion. At some point we realized that with a
callback, or with just an event, it was hard to return an error status.
That is why we ended up with the token in Block IO 2: a status can be
returned after the completion event is signaled.
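
To make the token idea concrete, here is a minimal sketch in plain C. It
only models the pattern and is not EDK II code: every MODEL_* name is a
hypothetical stand-in I made up for illustration. The one real thing it
mirrors is the shape of EFI_BLOCK_IO2_TOKEN, which pairs an Event with a
TransactionStatus. The point is the ordering: the driver stores the
status in the token first and only then signals the event, so the caller
can read an error once it sees the event signaled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for EFI_STATUS and EFI_EVENT (illustration only). */
typedef int MODEL_STATUS;
#define MODEL_SUCCESS       0
#define MODEL_NOT_READY     1
#define MODEL_DEVICE_ERROR  2

typedef struct {
  bool Signaled;
} MODEL_EVENT;

/* Shaped like EFI_BLOCK_IO2_TOKEN: a completion event plus a status slot. */
typedef struct {
  MODEL_EVENT   *Event;
  MODEL_STATUS  TransactionStatus;
} MODEL_AUDIO_TOKEN;

/* Hypothetical driver state: at most one queued playback request. */
typedef struct {
  MODEL_AUDIO_TOKEN *Pending;
} MODEL_AUDIO_DRIVER;

/* Queue playback and return immediately; the caller keeps running. */
MODEL_STATUS
ModelPlayAsync (MODEL_AUDIO_DRIVER *Driver, MODEL_AUDIO_TOKEN *Token)
{
  assert (Driver != NULL && Token != NULL && Token->Event != NULL);
  if (Driver->Pending != NULL) {
    return MODEL_NOT_READY;               /* a request is already queued */
  }
  Token->Event->Signaled   = false;
  Token->TransactionStatus = MODEL_NOT_READY;
  Driver->Pending          = Token;
  return MODEL_SUCCESS;
}

/* Driver side, when the hardware finishes: status first, then signal. */
void
ModelComplete (MODEL_AUDIO_DRIVER *Driver, MODEL_STATUS Result)
{
  MODEL_AUDIO_TOKEN *Token = Driver->Pending;

  if (Token == NULL) {
    return;                               /* nothing was queued */
  }
  Token->TransactionStatus = Result;
  Token->Event->Signaled   = true;
  Driver->Pending          = NULL;
}

/* Non-blocking poll, playing the role gBS->CheckEvent() has in UEFI. */
bool
ModelCheckEvent (const MODEL_EVENT *Event)
{
  return Event->Signaled;
}
```

A caller would queue the boot bong with ModelPlayAsync(), carry on with
other UI work, and gate only the dependent UI element on
ModelCheckEvent(), reading Token.TransactionStatus afterwards to see
whether playback actually worked.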

There is also more flexibility with events, as they let you define a
GUID'ed event and broadcast this state to other entities if you want.
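
As a rough illustration of the broadcast idea, the sketch below models
an event group in plain C. In UEFI the real mechanism is
gBS->CreateEventEx() with an EventGroup GUID; nothing here calls that
API, and all MODEL_* names and the fixed-size table are assumptions made
up for the example. It shows only the behavior that matters: signaling
the group notifies every listener that registered under the same GUID.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for EFI_GUID (illustration only). */
typedef struct {
  unsigned char Bytes[16];
} MODEL_GUID;

typedef void (*MODEL_NOTIFY)(void *Context);

/* One registered listener: the group it joined and its notify function. */
typedef struct {
  MODEL_GUID    Group;
  MODEL_NOTIFY  Notify;
  void          *Context;
} MODEL_GROUP_EVENT;

#define MODEL_MAX_EVENTS  8

static MODEL_GROUP_EVENT  mEvents[MODEL_MAX_EVENTS];
static size_t             mEventCount;

/* Register a listener in a group; returns 0 on success, -1 on failure. */
int
ModelCreateEventEx (const MODEL_GUID *Group, MODEL_NOTIFY Notify, void *Context)
{
  if (Group == NULL || Notify == NULL || mEventCount == MODEL_MAX_EVENTS) {
    return -1;
  }
  mEvents[mEventCount].Group   = *Group;
  mEvents[mEventCount].Notify  = Notify;
  mEvents[mEventCount].Context = Context;
  mEventCount++;
  return 0;
}

/* Notify every listener in the group; returns how many were notified. */
size_t
ModelSignalGroup (const MODEL_GUID *Group)
{
  size_t Notified = 0;

  assert (Group != NULL);
  for (size_t Index = 0; Index < mEventCount; Index++) {
    if (memcmp (&mEvents[Index].Group, Group, sizeof (*Group)) == 0) {
      mEvents[Index].Notify (mEvents[Index].Context);
      Notified++;
    }
  }
  return Notified;
}

/* Example listener: counts how often it was notified. */
void
ModelCountNotify (void *Context)
{
  (*(int *)Context)++;
}
```

With something like this, an audio driver could signal a "playback
finished" group once, and both the UI code and any other interested
party would hear about it without the driver knowing who is listening.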

Thanks,

Andrew Fish

> Best regards,
> Marvin
>=20
>>=20
>> I'm not saying there should be *no* async playback, I am saying it may =
be worth considering implementing it differently from caller-owned events. =
I'm not concerned with overhead, I'm concerned with points of failure (e.g.=
 leaks).
>>=20
>> I very briefly discussed some things with Ethin and it seems like the d=
efault EDK II timer interval of 10 ms may be problematic, but I am not sure=
. Just leaving it here as something to keep in mind.
>>=20
>> Best regards,
>> Marvin
>>=20
>>>=20
>>> [1] Overhead for making an asynchronous call.
>>> AUDIO_TOKEN AudioToken;
>>> // Plain waitable event: CreateEvent() rejects EVT_NOTIFY_SIGNAL with a
>>> // NULL NotifyFunction (EFI_INVALID_PARAMETER).
>>> gBS->CreateEvent (0, 0, NULL, NULL, &AudioToken.Event);
>>>=20
>>> Thanks,
>>>=20
>>> Andrew Fish
>>>=20
>>>> I would be happy to discuss this with you on the UEFI talkbox. I'm
>>>> draeand on there.
>>>> As for your questions:
>>>>=20
>>>> 1. The only reason I recommend using an event to signal audio
>>>> completion is because I do not want this protocol to be blocking at
>>>> all. (So, perhaps removing the token entirely is a good idea.) The
>>>> VirtIO audio device says nothing about synchronization, but I imagine
>>>> it's asynchronous because every audio specification I've seen out ther=
e
>>>> is asynchronous. Similarly, every audio API in existence -- at least,
>>>> every low-level OS-specific one -- is asynchronous/non-blocking.
>>>> (Usually, audio processing is handled on a separate thread.) However,
>>>> UEFI has no concept of threads or processes. Though we could use the
>>>> MP PI package to spin up a separate processor, that would fail on
>>>> uniprocessor, unicore systems. Audio processing needs a high enough
>>>> priority that it gets first in the list of tasks served while
>>>> simultaneously not getting a priority that's so high that it blocks
>>>> everything else. This is primarily because of the way an audio
>>>> subsystem is designed and the way an audio device functions: the audi=
o
>>>> subsystem needs to know, immediately, when the audio buffer has run
>>>> out of samples and needs more, and it needs to react immediately to
>>>> refill the buffer if required, especially when streaming large amount=
s
>>>> of audio (e.g.: music). Similarly, the audio subsystem needs the
>>>> ability to react as soon as is viable when playback is requested,
>>>> because any significant delay will be noticeable by the end-user. In
>>>> more complex systems like FMOD or OpenAL, the audio processing thread
>>>> also needs a high priority to ensure that audio effects, positioning
>>>> information, dithering, etc., can be configured immediately because
>>>> the user will notice if any glitches or delays occur. The UEFI audio
>>>> protocols obviously will be nowhere near as complex, or as advanced,
>>>> because no one will need audio effects in a preboot environment.
>>>> Granted, it's possible to make small audio effects, for example delays=
,
>>>> even if the protocol doesn't have functions to do that, but if an
>>>> end-user wants to go absolutely crazy with the audio samples and mix
>>>> in a really nice-sounding reverb or audio filter before sending the
>>>> samples to the audio engine, well, that's what they want to do and
>>>> that's out of our hands as driver/protocol developers. But I digress.
>>>> UEFI only has four TPLs, and so what we hopefully want is an engine
>>>> that is able to manage sample buffering and transmission, but also
>>>> doesn't block the application that's using the protocol. For some
>>>> things, blocking might be acceptable, but for speech synthesis or the
>>>> playing of startup sounds, this would not be an acceptable result and
>>>> would make the protocol pretty much worthless in the majority of
>>>> scenarios. So that's why I had an event to signal audio completion --
>>>> it was (perhaps) a cheap hack around the cooperatively-scheduled task
>>>> architecture of UEFI. (At least, I think it's cooperative multitasking=
,
>>>> correct me if I'm wrong.)
>>>> 2. The VirtIO specification does not specify what occurs in the event
>>>> that a request is received to play a stream that's already being
>>>> played. However, it does provide enough information for extrapolation=
.
>>>> Every request that's sent to a VirtIO sound device must come with two
>>>> things: a stream ID and a buffer of samples. The sample data must
>>>> immediately follow the request. Therefore, for VirtIO in particular,
>>>> the device will simply stop playing the old set of samples and play
>>>> the new set instead. This goes along with what I've seen in other
>>>> specifications like the HDA one: unless the device in question
>>>> supports more than one stream, it is impossible to play two sounds on
>>>> a single stream simultaneously, and an HDA controller (for example) i=
s
>>>> not going to perform any mixing; mixing is done purely in software.
>>>> Similarly, if a device does support multiple streams, it is
>>>> unspecified whether the device will play two or more streams
>>>> simultaneously or whether it will pause/abort the playback of one
>>>> while it plays another. Therefore, I believe (though cannot confirm)
>>>> that OSes like Windows simply use a single stream, even if the device
>>>> supports multiple streams, and just makes the applications believe
>>>> that unlimited streams are possible.
>>>>=20
>>>> I apologize for this really long-winded email, and I hope no one mind=
s. :-)
>>>>=20
>>>> On 4/17/21, Marvin H=C3=A4user <mhaeuser@posteo.de <mailto:mhaeuser@p=
osteo.de>> wrote:
>>>>> On 17.04.21 19:31, Andrew Fish via groups.io <http://groups.io> wrot=
e:
>>>>>>=20
>>>>>>=20
>>>>>>> On Apr 17, 2021, at 9:51 AM, Marvin H=C3=A4user <mhaeuser@posteo.d=
e <mailto:mhaeuser@posteo.de>
>>>>>>> <mailto:mhaeuser@posteo.de <mailto:mhaeuser@posteo.de>>> wrote:
>>>>>>>=20
>>>>>>> On 16.04.21 19:45, Ethin Probst wrote:
>>>>>>>> Yes, three APIs (maybe like this) would work well:
>>>>>>>> - Start, Stop: begin playback of a stream
>>>>>>>> - SetVolume, GetVolume, Mute, Unmute: control volume of output an=
d
>>>>>>>> enable muting
>>>>>>>> - CreateStream, ReleaseStream, SetStreamSampleRate: Control sampl=
e
>>>>>>>> rate of stream (but not sample format since Signed 16-bit PCM is
>>>>>>>> enough)
>>>>>>>> Marvin, how do you suggest we make the events then? We need some =
way
>>>>>>>> of notifying the caller that the stream has concluded. We could m=
ake
>>>>>>>> the driver create the event and pass it back to the caller as an
>>>>>>>> event, but you'd still have dangling pointers (this is C, after a=
ll).
>>>>>>>> We could just make a IsPlaying() function and WaitForCompletion()
>>>>>>>> function and allow the driver to do the event handling -- would t=
hat
>>>>>>>> work?
>>>>>>>=20
>>>>>>> I do not know enough about the possible use-cases to tell. Aside f=
rom
>>>>>>> the two functions you already mentioned, you could also take in an
>>>>>>> (optional) notification function.
>>>>>>> Which possible use-cases does determining playback end have? If it=
's
>>>>>>> too much effort, just use EFI_EVENT I guess, just the less code ca=
n
>>>>>>> mess it up, the better.
>>>>>>>=20
>>>>>>=20
>>>>>> In UEFI EFI_EVENT works much better. There is a gBS->WaitForEvent()
>>>>>> function that lets a caller wait on an event. That is basically wha=
t
>>>>>> the UEFI Shell is doing at the Shell prompt. A GUI in UEFI/C is
>>>>>> basically an event loop.
>>>>>>=20
>>>>>> Fun fact: I ended up adding gIdleLoopEventGuid to the MdeModulePkg =
so
>>>>>> the DXE Core could signal gIdleLoopEventGuid if you are sitting in
>>>>>> gBS->WaitForEvent() and no event is signaled. Basically in EFI nothi=
ng
>>>>>> is going to happen until the next timer tick so the gIdleLoopEventG=
uid
>>>>>> lets you idle the CPU until the next timer tick. I was forced to do
>>>>>> this as the 1st MacBook Air had a bad habit of thermal tripping whe=
n
>>>>>> sitting at the UEFI Shell prompt. After all another name for a loop=
 in
>>>>>> C code running on bare metal is a power virus.
>>>>>=20
>>>>> Mac EFI is one of the best implementations we know of, frankly. I'm
>>>>> traumatised by Aptio 4 and alike, where (some issues are OEM-specifi=
c I
>>>>> think) you can have timer events signalling after ExitBS, there is e=
vent
>>>>> clutter on IO polling to the point where everything lags no matter w=
hat
>>>>> you do, and even in "smooth" scenarios there may be nothing worth th=
e
>>>>> description "granularity" (events scheduled to run every 10 ms may r=
un
>>>>> every 50 ms). Events are the last resort for us, if there really is =
no
>>>>> other way. My first GUI implementation worked without events at all =
for
>>>>> this reason, but as our workarounds got better, we did start using t=
hem
>>>>> for keyboard and mouse polling.
>>>>>=20
>>>>> Timers do not apply here, but what does apply is resource management=
.
>>>>> Using EFI_EVENT directly means (to the outside) the introduction of =
a
>>>>> new resource to maintain, for each caller separately. On the other s=
ide,
>>>>> there is no resource to misuse or leak if none such is exposed. Yet,=
 if
>>>>> you argue with APIs like WaitForEvent, something has to signal it. I=
n a
>>>>> simple environment this would mean, some timer event is running and =
may
>>>>> signal the event the main code waits for, where the above concerns actu=
ally
>>>>> do apply. :) Again, the recommendation assumes the use-cases are sim=
ple
>>>>> enough to easily avoid them.
>>>>>=20
>>>>> I think it would be best to sketch use-cases for audio and design th=
e
>>>>> solutions closely to the requirements. Why do we need to know when a=
udio
>>>>> finished? What will happen when we queue audio twice? There are many
>>>>> layers (UX, interface, implementation details) of questions in comin=
g up
>>>>> with a pleasant and stable design.
>>>>>=20
>>>>> Best regards,
>>>>> Marvin
>>>>>=20
>>>>>>=20
>>>>>> Thanks,
>>>>>>=20
>>>>>> Andrew Fish.
>>>>>>=20
>>>>>>> If I remember correctly you mentioned the UEFI Talkbox before, if
>>>>>>> that is more convenient for you, I'm there as mhaeuser.
>>>>>>>=20
>>>>>>> Best regards,
>>>>>>> Marvin
>>>>>>>=20
>>>>>>>>=20
>>>>>>>> On 4/16/21, Andrew Fish <afish@apple.com <mailto:afish@apple.com>=
 <mailto:afish@apple.com <mailto:afish@apple.com>>>
>>>>>>>> wrote:
>>>>>>>>>=20
>>>>>>>>>> On Apr 16, 2021, at 4:34 AM, Leif Lindholm <leif@nuviainc.com <=
mailto:leif@nuviainc.com>
>>>>>>>>>> <mailto:leif@nuviainc.com <mailto:leif@nuviainc.com>>> wrote:
>>>>>>>>>>=20
>>>>>>>>>> Hi Ethin,
>>>>>>>>>>=20
>>>>>>>>>> I think we also want to have a SetMode function, even if we don=
't get
>>>>>>>>>> around to implement proper support for it as part of GSoC (alth=
ough I
>>>>>>>>>> expect at least for virtio, that should be pretty straightforwa=
rd).
>>>>>>>>>>=20
>>>>>>>>> Leif,
>>>>>>>>>=20
>>>>>>>>> I=E2=80=99m thinking that if we have an API to load the buffer and a 2nd=
 API to
>>>>>>>>> play the
>>>>>>>>> buffer an optional 3rd API could configure the streams.
>>>>>>>>>=20
>>>>>>>>>> It's quite likely that speech for UI would be stored as 8kHz (o=
r
>>>>>>>>>> 20kHz) in some systems, whereas the example for playing a tune =
in
>>>>>>>>>> GRUB
>>>>>>>>>> would more likely be a 44.1 kHz mp3/wav/ogg/flac.
>>>>>>>>>>=20
>>>>>>>>>> For the GSoC project, I think it would be quite reasonable to
>>>>>>>>>> pre-generate pure PCM streams for testing rather than decoding
>>>>>>>>>> anything on the fly.
>>>>>>>>>>=20
>>>>>>>>>> Porting/writing decoders is really a separate task from enablin=
g the
>>>>>>>>>> output. I would much rather see USB *and* HDA support able to p=
lay
>>>>>>>>>> pcm
>>>>>>>>>> streams before worrying about decoding.
>>>>>>>>>>=20
>>>>>>>>> I agree it might turn out it is easier to have the text to speec=
h
>>>>>>>>> code just
>>>>>>>>> encode a PCM directly.
>>>>>>>>>=20
>>>>>>>>> Thanks,
>>>>>>>>>=20
>>>>>>>>> Andrew Fish
>>>>>>>>>=20
>>>>>>>>>> /
>>>>>>>>>>    Leif
>>>>>>>>>>=20
>>>>>>>>>> On Fri, Apr 16, 2021 at 00:33:06 -0500, Ethin Probst wrote:
>>>>>>>>>>> Thanks for that explanation (I missed Mike's message). Earlier=
 I
>>>>>>>>>>> sent
>>>>>>>>>>> a summary of those things that we can agree on: mainly, that w=
e have
>>>>>>>>>>> mute, volume control, a load buffer, (maybe) an unload buffer,=
 and a
>>>>>>>>>>> start/stop stream function. Now that I fully understand the
>>>>>>>>>>> ramifications of this I don't mind settling for a specific for=
mat
>>>>>>>>>>> and
>>>>>>>>>>> sample rate, and signed 16-bit PCM audio is, I think, the most
>>>>>>>>>>> widely
>>>>>>>>>>> used one out there, besides 64-bit floating point samples, whi=
ch
>>>>>>>>>>> I've
>>>>>>>>>>> only seen used in DAWs, and that's something we don't need.
>>>>>>>>>>> Are you sure you want the firmware itself to handle the decodi=
ng of
>>>>>>>>>>> WAV audio? I can make a library class for that, but I'll defin=
itely
>>>>>>>>>>> need help with the security aspect.
>>>>>>>>>>>=20
>>>>>>>>>>> On 4/16/21, Andrew Fish via groups.io <http://groups.io> <http=
://groups.io <http://groups.io>>
>>>>>>>>>>> <afish=3Dapple.com@groups.io <mailto:afish=3Dapple.com@groups.=
io> <mailto:afish=3Dapple.com@groups.io <mailto:afish=3Dapple.com@groups.io=
>>>
>>>>>>>>>>> wrote:
>>>>>>>>>>>>=20
>>>>>>>>>>>>> On Apr 15, 2021, at 5:59 PM, Michael Brown <mcb30@ipxe.org <=
mailto:mcb30@ipxe.org>
>>>>>>>>>>>>> <mailto:mcb30@ipxe.org <mailto:mcb30@ipxe.org>>> wrote:
>>>>>>>>>>>>>=20
>>>>>>>>>>>>> On 16/04/2021 00:42, Ethin Probst wrote:
>>>>>>>>>>>>>> Forcing a particular channel mapping, sample rate and sampl=
e
>>>>>>>>>>>>>> format
>>>>>>>>>>>>>> on
>>>>>>>>>>>>>> everyone would complicate application code. From an
>>>>>>>>>>>>>> application point
>>>>>>>>>>>>>> of view, one would, with that type of protocol, need to do =
the
>>>>>>>>>>>>>> following:
>>>>>>>>>>>>>> 1) Load an audio file in any audio file format from any sto=
rage
>>>>>>>>>>>>>> mechanism.
>>>>>>>>>>>>>> 2) Decode the audio file format to extract the samples and =
audio
>>>>>>>>>>>>>> metadata.
>>>>>>>>>>>>>> 3) Resample the (now decoded) audio samples and convert
>>>>>>>>>>>>>> (quantize)
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> audio samples into signed 16-bit PCM audio.
>>>>>>>>>>>>>> 4) forward the samples onto the EFI audio protocol.
>>>>>>>>>>>>> You have made an incorrect assumption that there exists a
>>>>>>>>>>>>> requirement
>>>>>>>>>>>>> to
>>>>>>>>>>>>> be able to play audio files in arbitrary formats.  This
>>>>>>>>>>>>> requirement
>>>>>>>>>>>>> does
>>>>>>>>>>>>> not exist.
>>>>>>>>>>>>>=20
>>>>>>>>>>>>> With a protocol-mandated fixed baseline set of audio paramet=
ers
>>>>>>>>>>>>> (sample
>>>>>>>>>>>>> rate etc), what would happen in practice is that the audio
>>>>>>>>>>>>> files would
>>>>>>>>>>>>> be
>>>>>>>>>>>>> encoded in that format at *build* time, using tools entirely
>>>>>>>>>>>>> external
>>>>>>>>>>>>> to
>>>>>>>>>>>>> UEFI.  The application code is then trivially simple: it jus=
t does
>>>>>>>>>>>>> "load
>>>>>>>>>>>>> blob, pass blob to audio protocol".
>>>>>>>>>>>>>=20
>>>>>>>>>>>>=20
>>>>>>>>>>>> Ethin,
>>>>>>>>>>>>=20
>>>>>>>>>>>> Given the goal is an industry standard we value interoperabil=
ity
>>>>>>>>>>>> more
>>>>>>>>>>>> than
>>>>>>>>>>>> flexibility.
>>>>>>>>>>>>=20
>>>>>>>>>>>> How about another use case. Let's say the Linux OS loader (Gru=
b)
>>>>>>>>>>>> wants
>>>>>>>>>>>> to
>>>>>>>>>>>> have an accessible UI so it decides to store sound files on th=
e EFI
>>>>>>>>>>>> System
>>>>>>>>>>>> Partition and use our new fancy UEFI Audio Protocol to add au=
dio
>>>>>>>>>>>> to the
>>>>>>>>>>>> OS
>>>>>>>>>>>> loader GUI. So that version of Grub needs to work on 1,000s of
>>>>>>>>>>>> different
>>>>>>>>>>>> PCs
>>>>>>>>>>>> and a wide range of UEFI Audio driver implementations. It is =
a much
>>>>>>>>>>>> easier
>>>>>>>>>>>> world if Wave PCM 16 bit just works every place. You could ad=
d a
>>>>>>>>>>>> lot of
>>>>>>>>>>>> complexity and try to encode the audio on the fly, maybe even=
 in
>>>>>>>>>>>> Linux
>>>>>>>>>>>> proper but that falls down if you are booting from read only
>>>>>>>>>>>> media like
>>>>>>>>>>>> a
>>>>>>>>>>>> DVD or backup tape (yes people still do that in server land).
>>>>>>>>>>>>=20
>>>>>>>>>>>> The other problem with flexibility is you just made the test =
matrix
>>>>>>>>>>>> very
>>>>>>>>>>>> large for every driver that needs to get implemented. For
>>>>>>>>>>>> something as
>>>>>>>>>>>> complex as Intel HDA how you hook up the hardware and what
>>>>>>>>>>>> CODECs you
>>>>>>>>>>>> use
>>>>>>>>>>>> may impact the quality of the playback for a given board. You=
r
>>>>>>>>>>>> EFI is
>>>>>>>>>>>> likely
>>>>>>>>>>>> going to pick a single encoding, and that will get tested all t=
he
>>>>>>>>>>>> time if
>>>>>>>>>>>> your
>>>>>>>>>>>> system has audio, but all 50 other things you support not so
>>>>>>>>>>>> much. So
>>>>>>>>>>>> that
>>>>>>>>>>>> will require testing, and someone with audiophile ears (or =
an AI
>>>>>>>>>>>> program)
>>>>>>>>>>>> to test all the combinations. I=E2=80=99m not kidding I get B=
Zs on the
>>>>>>>>>>>> quality
>>>>>>>>>>>> of
>>>>>>>>>>>> the boot bong on our systems.
>>>>>>>>>>>>=20
>>>>>>>>>>>>=20
>>>>>>>>>>>>>> typedef struct EFI_SIMPLE_AUDIO_PROTOCOL {
>>>>>>>>>>>>>>  EFI_SIMPLE_AUDIO_PROTOCOL_RESET Reset;
>>>>>>>>>>>>>>  EFI_SIMPLE_AUDIO_PROTOCOL_START Start;
>>>>>>>>>>>>>>  EFI_SIMPLE_AUDIO_PROTOCOL_STOP Stop;
>>>>>>>>>>>>>> } EFI_SIMPLE_AUDIO_PROTOCOL;
>>>>>>>>>>>>> This is now starting to look like something that belongs in
>>>>>>>>>>>>> boot-time
>>>>>>>>>>>>> firmware.  :)
>>>>>>>>>>>>>=20
>>>>>>>>>>>> I think that got a little too simple I=E2=80=99d go back and =
look at the
>>>>>>>>>>>> example
>>>>>>>>>>>> I
>>>>>>>>>>>> posted to the thread but add an API to load the buffer, and t=
hen
>>>>>>>>>>>> play
>>>>>>>>>>>> the
>>>>>>>>>>>> buffer (that way we can add an API in the future to twiddle knobs=
).
>>>>>>>>>>>> That
>>>>>>>>>>>> API
>>>>>>>>>>>> also implements the async EFI interface. Trust me the 1st thi=
ng
>>>>>>>>>>>> that is
>>>>>>>>>>>> going to happen when we add audio is some one is going to
>>>>>>>>>>>> complain in
>>>>>>>>>>>> xyz
>>>>>>>>>>>> state we should mute audio, or we should honor audio volume a=
nd
>>>>>>>>>>>> mute
>>>>>>>>>>>> settings from setup, or from values set in the OS. Or some on=
e
>>>>>>>>>>>> is going
>>>>>>>>>>>> to
>>>>>>>>>>>> want the volume keys on the keyboard to work in EFI.
>>>>>>>>>>>>=20
>>>>>>>>>>>> Also if you need to pick apart the Wave PCM 16 bit file to f=
eed
>>>>>>>>>>>> it into
>>>>>>>>>>>> the
>>>>>>>>>>>> audio hardware that probably means we should have a library t=
hat
>>>>>>>>>>>> does
>>>>>>>>>>>> that
>>>>>>>>>>>> work, so other Audio drivers can share that code. Also having=
 a
>>>>>>>>>>>> library
>>>>>>>>>>>> makes it easier to write a unit test. We need to be security
>>>>>>>>>>>> conscious
>>>>>>>>>>>> as we
>>>>>>>>>>>> need to treat the audio file as attacker controlled data.
>>>>>>>>>>>>=20
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>=20
>>>>>>>>>>>> Andrew Fish
>>>>>>>>>>>>=20
>>>>>>>>>>>>> Michael
>>>>>>>>>>>>>=20
>>>>>>>>>>>>>=20
>>>>>>>>>>>>>=20
>>>>>>>>>>>>>=20
>>>>>>>>>>>>>=20
>>>>>>>>>>>>=20
>>>>>>>>>>>>=20
>>>>>>>>>>>>=20
>>>>>>>>>>>>=20
>>>>>>>>>>>>=20
>>>>>>>>>>>>=20
>>>>>>>>>>>=20
>>>>>>>>>>> --=20
>>>>>>>>>>> Signed,
>>>>>>>>>>> Ethin D. Probst
>>>>>>>>>>=20
>>>>>>>>>>=20
>>>>>>>>>>=20
>>>>>>>>>>=20
>>>>>>>>>=20
>>>>>>>>=20
>>>>>>>=20
>>>>>>>=20
>>>>>>>=20
>>>>>=20
>>>>>=20
>>>>=20
>>>>=20
>>>> --=20
>>>> Signed,
>>>> Ethin D. Probst
>>>=20
>>=20
>=20
>=20
>=20
>=20


--Apple-Mail=_7356BD0C-473A-494E-9EE4-A226AB7F9F79
Content-Transfer-Encoding: quoted-printable
Content-Type: text/html;
	charset=utf-8

<html><head><meta http-equiv=3D"Content-Type" content=3D"text/html; charset=
=
=3Dutf-8"></head><body style=3D"word-wrap: break-word; -webkit-nbsp-mode: =
space; line-break: after-white-space;" class=3D""><br class=3D""><div><br c=
lass=3D""><blockquote type=3D"cite" class=3D""><div class=3D"">On Apr 18, 2=
021, at 12:22 PM, Marvin H=C3=A4user &lt;<a href=3D"mailto:mhaeuser@posteo.=
de" class=3D"">mhaeuser@posteo.de</a>&gt; wrote:</div><br class=3D"Apple-in=
terchange-newline"><div class=3D""><meta charset=3D"UTF-8" class=3D""><span=
 style=3D"caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12p=
x; font-style: normal; font-variant-caps: normal; font-weight: normal; lett=
er-spacing: normal; text-align: start; text-indent: 0px; text-transform: no=
ne; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;=
 text-decoration: none; float: none; display: inline !important;" class=3D"=
">On 18.04.21 21:11, Marvin H=C3=A4user wrote:</span><br style=3D"caret-col=
or: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: norm=
al; font-variant-caps: normal; font-weight: normal; letter-spacing: normal;=
 text-align: start; text-indent: 0px; text-transform: none; white-space: no=
rmal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: n=
one;" class=3D""><blockquote type=3D"cite" style=3D"font-family: Helvetica;=
 font-size: 12px; font-style: normal; font-variant-caps: normal; font-weigh=
t: normal; letter-spacing: normal; orphans: auto; text-align: start; text-i=
ndent: 0px; text-transform: none; white-space: normal; widows: auto; word-s=
pacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px=
; text-decoration: none;" class=3D"">On 18.04.21 17:22, Andrew Fish via <a =
href=3D"http://groups.io" class=3D"">groups.io</a> wrote:<br class=3D""><bl=
ockquote type=3D"cite" class=3D""><br class=3D""><br class=3D""><blockquote=
 type=3D"cite" class=3D"">On Apr 18, 2021, at 1:55 AM, Ethin Probst &lt;<a =
href=3D"mailto:harlydavidsen@gmail.com" class=3D"">harlydavidsen@gmail.com<=
/a> &lt;<a href=3D"mailto:harlydavidsen@gmail.com" class=3D"">mailto:harlyd=
avidsen@gmail.com</a>&gt;&gt; wrote:<br class=3D""><br class=3D""><blockquo=
te type=3D"cite" class=3D"">I think it would be best to sketch use-cases fo=
r audio and design the solutions closely to the requirements. Why do we nee=
d to know when audio finished? What will happen when we queue audio twice? =
There are many layers (UX, interface, implementation details) of questions =
in coming up with a pleasant and stable design.<br class=3D""></blockquote>=
</blockquote><br class=3D"">We are not using EFI to listen to music in the =
background. Any audio being played is part of a UI element and there might =
be synchronization requirements.<br class=3D""></blockquote><br class=3D"">=
Maybe I communicated that wrong, I'm not asking because I don't know what a=
udio is used for, I am saying ideally there is a written-down list of usage=
 requirements before the protocol is designed, because that is what the des=
ign targets. The details should follow the needs.<br class=3D""><br class=
=3D""><blockquote type=3D"cite" class=3D""><br class=3D"">For example play=
ing a boot bong on boot you want that to be asynchronous as you don=E2=80=
=99t want to delay boot to play the sound, but you may want to choose to ga=
te some UI elements on the boot bong completing. If you are building a menu=
 UI that is accessible you may need to synchronize playback with UI update,=
 but you may not want to make the slow sound playback blocking as you can g=
et other UI work done in parallel.<br class=3D""><br class=3D"">The overhea=
d for a caller making an async call is not much [1], but not having the cap=
ability could really restrict the API for its intended use. I=E2=80=99d als=
o point out we picked the same pattern as the async BlockIO and there is so=
mething to be said for having consistency in the UEFI Spec and having similar AP=
Is work in similar ways.<br class=3D""></blockquote></blockquote><br style=
=3D"caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; fo=
nt-style: normal; font-variant-caps: normal; font-weight: normal; letter-sp=
acing: normal; text-align: start; text-indent: 0px; text-transform: none; w=
hite-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text=
-decoration: none;" class=3D""><span style=3D"caret-color: rgb(0, 0, 0); fo=
nt-family: Helvetica; font-size: 12px; font-style: normal; font-variant-cap=
s: normal; font-weight: normal; letter-spacing: normal; text-align: start; =
text-indent: 0px; text-transform: none; white-space: normal; word-spacing: =
0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; di=
splay: inline !important;" class=3D"">Sorry a lot of the spam, but I someho=
w missed the "consistency" point. Sorry, but there seems to be no real cons=
istency. Block I/O and network things generally use the token event method =
(what is suggested here), while USB, Bluetooth, and such generally pass a c=
allback function directly (not necessarily what I suggest, as I don't know =
the full requirements, but certainly one way).</span><br style=3D"caret-col=
or: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: norm=
al; font-variant-caps: normal; font-weight: normal; letter-spacing: normal;=
 text-align: start; text-indent: 0px; text-transform: none; white-space: no=
rmal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: n=
one;" class=3D""><br style=3D"caret-color: rgb(0, 0, 0); font-family: Helve=
tica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-=
weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px=
; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-tex=
t-stroke-width: 0px; text-decoration: none;" class=3D""></div></blockquote>=
<div><br class=3D""></div><div>The networking interfaces are ancient, and i=
t looks like the recent BT stack leans into that model. Also those callback=
s are about returning data in the form of unknown packets which is not the =
problem we are trying to solve.&nbsp;</div><div><br class=3D""></div><div>T=
he async Block IO is more of a model of notification of a queued event and =
I think that maps better into what we are doing. The MP Services protocol f=
rom the PI spec also uses an optional event to notify completion. At some p=
oint we realized that with a callback, or just an event, it was hard to ret=
urn an error status; that is why we ended up with the token in Block IO =
2, so a status could be returned after the completion event was signaled.=
&nbsp;</d=
iv><div><br class=3D""></div><div>There is also more flexibility with event=
s as it lets you define a GUID=E2=80=99ed event and broadcast this state t=
o oth=
er entities if you want.&nbsp;</div><div><br class=3D""></div><div>Thanks,<=
/div><div><br class=3D""></div><div>Andrew Fish</div><br class=3D""><blockq=
uote type=3D"cite" class=3D""><div class=3D""><span style=3D"caret-color: r=
gb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; f=
ont-variant-caps: normal; font-weight: normal; letter-spacing: normal; text=
-align: start; text-indent: 0px; text-transform: none; white-space: normal;=
 word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; =
float: none; display: inline !important;" class=3D"">Best regards,</span><b=
r style=3D"caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12=
px; font-style: normal; font-variant-caps: normal; font-weight: normal; let=
ter-spacing: normal; text-align: start; text-indent: 0px; text-transform: n=
one; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px=
; text-decoration: none;" class=3D""><span style=3D"caret-color: rgb(0, 0, =
0); font-family: Helvetica; font-size: 12px; font-style: normal; font-varia=
nt-caps: normal; font-weight: normal; letter-spacing: normal; text-align: s=
tart; text-indent: 0px; text-transform: none; white-space: normal; word-spa=
cing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: no=
ne; display: inline !important;" class=3D"">Marvin</span><br style=3D"caret=
-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: =
normal; font-variant-caps: normal; font-weight: normal; letter-spacing: nor=
mal; text-align: start; text-indent: 0px; text-transform: none; white-space=
: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoratio=
n: none;" class=3D""><br style=3D"caret-color: rgb(0, 0, 0); font-family: H=
elvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; f=
ont-weight: normal; letter-spacing: normal; text-align: start; text-indent:=
 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit=
-text-stroke-width: 0px; text-decoration: none;" class=3D""><blockquote typ=
e=3D"cite" style=3D"font-family: Helvetica; font-size: 12px; font-style: no=
rmal; font-variant-caps: normal; font-weight: normal; letter-spacing: norma=
l; orphans: auto; text-align: start; text-indent: 0px; text-transform: none=
; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-a=
djust: auto; -webkit-text-stroke-width: 0px; text-decoration: none;" class=
=3D""><br class=3D"">I'm not saying there should be *no* async playback, I=
 am saying it may be worth considering implementing it differently from cal=
ler-owned events. I'm not concerned with overhead, I'm concerned with point=
s of failure (e.g. leaks).<br class=3D""><br class=3D"">I very briefly disc=
ussed some things with Ethin and it seems like the default EDK II timer int=
erval of 10 ms may be problematic, but I am not sure. Just leaving it here =
as something to keep it mind.<br class=3D""><br class=3D"">Best regards,<br=
 class=3D"">Marvin<br class=3D""><br class=3D""><blockquote type=3D"cite" c=
lass=3D""><br class=3D"">[1] Overhead for making an asynchronous call.<br c=
lass=3D"">AUDIO_TOKEN AudioToken;<br class=3D"">gBS-&gt;CreateEvent &nbsp;(=
EVT_NOTIFY_SIGNAL, TPL_CALLBACK, NULL, NULL, &amp;AudioToken.Event);<br cla=
ss=3D""><br class=3D"">Thanks,<br class=3D""><br class=3D"">Andrew Fish<br =
class=3D""><br class=3D""><blockquote type=3D"cite" class=3D"">I would be h=
appy to discuss this with you on the UEFI talkbox. I'm<br class=3D"">draean=
d on there.<br class=3D"">As for your questions:<br class=3D""><br class=3D=
"">1. The only reason I recommend using an event to signal audio<br class=
=3D"">completion is because I do not want this protocol to be blocking at<=
br class=3D"">all. (So, perhaps removing the token entirely is a good idea.=
) The<br class=3D"">VirtIO audio device says nothing about synchronization,=
 but I imagine<br class=3D"">its asynchronous because every audio specifica=
tion I've seen out there<br class=3D"">is asynchronous. Similarly, every au=
dio API in existence -- at least,<br class=3D"">every low-level OS-specific=
 one -- is asynchronous/non-blocking.<br class=3D"">(Usually, audio process=
ing is handled on a separate thread.) However,<br class=3D"">UEFI has no co=
ncept of threads or processes. Though we could use the<br class=3D"">MP PI =
package to spin up a separate processor, that would fail on<br class=3D"">u=
niprocessor, unicore systems. Audio processing needs a high enough<br class=
=
=3D"">priority that it gets first in the list of tasks served while<br cla=
ss=3D"">simultaneously not getting a priority that's so high that it blocks=
<br class=3D"">everything else. This is primarily because of the way an aud=
io<br class=3D"">subsystem is designed and the way an audio device function=
s: the audio<br class=3D"">subsystem needs to know, immediately, when the a=
udio buffer has ran<br class=3D"">out of samples and needs more, and it nee=
ds to react immediately to<br class=3D"">refill the buffer if required, esp=
ecially when streaming large amounts<br class=3D"">of audio (e.g.: music). =
Similarly, the audio subsystem needs the<br class=3D"">ability to react as =
soon as is viable when playback is requested,<br class=3D"">because any sig=
nificant delay will be noticeable by the end-user. In<br class=3D"">more co=
mplex systems like FMOD or OpenAL, the audio processing thread<br class=3D"=
">also needs a high priority to ensure that audio effects, positioning<br c=
lass=3D"">information, dithering, etc., can be configured immediately becau=
se<br class=3D"">the user will notice if any glitches or delays occur. The =
UEFI audio<br class=3D"">protocols obviously will be nowhere near as comple=
x, or as advanced,<br class=3D"">because no one will need audio effects in =
a preboot environment.<br class=3D"">Granted, its possible to make small au=
dio effects, for example delays,<br class=3D"">even if the protocol doesn't=
 have functions to do that, but if an<br class=3D"">end-user wants to go ab=
solutely crazy with the audio samples and mix<br class=3D"">in a really nic=
e-sounding reverb or audio filter before sending the<br class=3D"">samples =
to the audio engine, well, that's what they want to do and<br class=3D"">th=
at's out of our hands as driver/protocol developers. But I digress.<br clas=
s=3D"">UEFI only has four TPLs, and so what we hopefully want is an engine<=
br class=3D"">that is able to manage sample buffering and transmission, but=
 also<br class=3D"">doesn't block the application that's using the protocol=
. For some<br class=3D"">things, blocking might be acceptable, but for spee=
ch synthesis or the<br class=3D"">playing of startup sounds, this would not=
 be an acceptable result and<br class=3D"">would make the protocol pretty m=
uch worthless in the majority of<br class=3D"">scenarios. So that's why I h=
ad an event to signal audio completion --<br class=3D"">it was (perhaps) a =
cheap hack around the cooperatively-scheduled task<br class=3D"">architectu=
re of UEFI. (At least, I think its cooperative multitasking,<br class=3D"">=
correct me if I'm wrong.)<br class=3D"">2. The VirtIO specification does no=
t specify what occurs in the event<br class=3D"">that a request is received=
 to play a stream that's already being<br class=3D"">played. However, it do=
es provide enough information for extrapolation.<br class=3D"">Every reques=
t that's sent to a VirtIO sound device must come with two<br class=3D"">thi=
ngs: a stream ID and a buffer of samples. The sample data must<br class=3D"=
">immediately follow the request. Therefore, for VirtIO in particular,<br c=
lass=3D"">the device will simply stop playing the old set of samples and pl=
ay<br class=3D"">the new set instead. This goes along with what I've seen i=
n other<br class=3D"">specifications like the HDA one: unless the device in=
 question<br class=3D"">supports more than one stream, it is impossible to =
play two sounds on<br class=3D"">a single stream simultaneously, and an HDA=
 controller (for example) is<br class=3D"">not going to perform any mixing;=
 mixing is done purely in software.<br class=3D"">Similarly, if a device do=
es support multiple streams, it is<br class=3D"">unspecified whether the de=
vice will play two or more streams<br class=3D"">simultaneously or whether =
it will pause/abort the playback of one<br class=3D"">while it plays anothe=
r. Therefore, I believe (though cannot confirm)<br class=3D"">that OSes lik=
e Windows simply use a single stream, even if the device<br class=3D"">supp=
orts multiple streams, and just makes the applications believe<br class=3D"=
">that unlimited streams are possible.<br class=3D""><br class=3D"">I apolo=
gize for this really long-winded email, and I hope no one minds. :-)<br cla=
ss=3D""><br class=3D"">On 4/17/21, Marvin H=C3=A4user &lt;<a href=3D"mailto=
:mhaeuser@posteo.de" class=3D"">mhaeuser@posteo.de</a> &lt;<a href=3D"mailt=
o:mhaeuser@posteo.de" class=3D"">mailto:mhaeuser@posteo.de</a>&gt;&gt; wrot=
e:<br class=3D""><blockquote type=3D"cite" class=3D"">On 17.04.21 19:31, An=
drew Fish via <a href=3D"http://groups.io" class=3D"">groups.io</a> &lt;<a =
href=3D"http://groups.io" class=3D"">http://groups.io</a>&gt; wrote:<br cla=
ss=3D""><blockquote type=3D"cite" class=3D""><br class=3D""><br class=3D"">=
<blockquote type=3D"cite" class=3D"">On Apr 17, 2021, at 9:51 AM, Marvin H=
=C3=A4user &lt;<a href=3D"mailto:mhaeuser@posteo.de" class=3D"">mhaeuser@p=
osteo.de</a> &lt;<a href=3D"mailto:mhaeuser@posteo.de" class=3D"">mailto:mh=
aeuser@posteo.de</a>&gt;<br class=3D"">&lt;<a href=3D"mailto:mhaeuser@poste=
o.de" class=3D"">mailto:mhaeuser@posteo.de</a> &lt;<a href=3D"mailto:mhaeus=
er@posteo.de" class=3D"">mailto:mhaeuser@posteo.de</a>&gt;&gt;&gt; wrote:<b=
r class=3D""><br class=3D"">On 16.04.21 19:45, Ethin Probst wrote:<br class=
=
=3D""><blockquote type=3D"cite" class=3D"">Yes, three APIs (maybe like thi=
s) would work well:<br class=3D"">- Start, Stop: begin playback of a stream=
<br class=3D"">- SetVolume, GetVolume, Mute, Unmute: control volume of outp=
ut and<br class=3D"">enable muting<br class=3D"">- CreateStream, ReleaseStr=
eam, SetStreamSampleRate: Control sample<br class=3D"">rate of stream (but =
not sample format since Signed 16-bit PCM is<br class=3D"">enough)<br class=
=
=3D"">Marvin, how do you suggest we make the events then? We need some way=
<br class=3D"">of notifying the caller that the stream has concluded. We co=
uld make<br class=3D"">the driver create the event and pass it back to the =
caller as an<br class=3D"">event, but you'd still have dangling pointers (t=
his is C, after all).<br class=3D"">We could just make a IsPlaying() functi=
on and WaitForCompletion()<br class=3D"">function and allow the driver to d=
o the event handling -- would that<br class=3D"">work?<br class=3D""></bloc=
kquote><br class=3D"">I do not know enough about the possible use-cases to =
tell. Aside from<br class=3D"">the two functions you already mentioned, you=
 could also take in an<br class=3D"">(optional) notification function.<br c=
lass=3D"">Which possible use-cases does determining playback end have? If i=
t's<br class=3D"">too much effort, just use EFI_EVENT I guess, just the les=
s code can<br class=3D"">mess it up, the better.<br class=3D""><br class=3D=
""></blockquote><br class=3D"">In UEFI EFI_EVENT works much better. There i=
s a gBS-WaitForEvent()<br class=3D"">function that lets a caller wait on an=
 event. That is basically what<br class=3D"">the UEFI Shell is doing at the=
 Shell prompt. A GUI in UEFI/C is<br class=3D"">basically an event loop.<br=
 class=3D""><br class=3D"">Fun fact: I ended up adding&nbsp;gIdleLoopEventG=
uid to the MdeModulePkg so<br class=3D"">the DXE Core could signal gIdleLoo=
pEventGuid if you are sitting in<br class=3D"">gBS-WaitForEvent() and no ev=
ent is signaled. Basically in EFI nothing<br class=3D"">is going to happen =
until the next timer tick so the gIdleLoopEventGuid<br class=3D"">lets you =
idle the CPU until the next timer tick. I was forced to do<br class=3D"">th=
is as the 1st MacBook Air had a bad habit of thermal tripping when<br class=
=
=3D"">sitting at the UEFI Shell prompt. After all another name for a loop =
in<br class=3D"">C code running on bare metal is a power virus.<br class=3D=
""></blockquote><br class=3D"">Mac EFI is one of the best implementations w=
e know of, frankly. I'm<br class=3D"">traumatised by Aptio 4 and alike, whe=
re (some issues are OEM-specific I<br class=3D"">think) you can have timer =
events signalling after ExitBS, there is event<br class=3D"">clutter on IO =
polling to the point where everything lags no matter what<br class=3D"">you=
 do, and even in "smooth" scenarios there may be nothing worth the<br class=
=
=3D"">description "granularity" (events scheduled to run every 10 ms may r=
un<br class=3D"">every 50 ms). Events are the last resort for us, if there =
really is no<br class=3D"">other way. My first GUI implementation worked wi=
thout events at all for<br class=3D"">this reason, but as our workarounds g=
ot better, we did start using them<br class=3D"">for keyboard and mouse pol=
ling.<br class=3D""><br class=3D"">Timers do not apply here, but what does =
apply is resource management.<br class=3D"">Using EFI_EVENT directly means =
(to the outside) the introduction of a<br class=3D"">new resource to mainta=
in, for each caller separately. On the other side,<br class=3D"">there is n=
o resource to misuse or leak if none such is exposed. Yet, if<br class=3D""=
>you argue with APIs like WaitForEvent, something has to signal it. In a<br=
 class=3D"">simple environment this would mean, some timer event is running=
 and may<br class=3D"">signal the event the main code waits for, where abov=
e's concern actually<br class=3D"">do apply. :) Again, the recommendation a=
ssumes the use-cases are simple<br class=3D"">enough to easily avoid them.<=
br class=3D""><br class=3D"">I think it would be best to sketch use-cases f=
or audio and design the<br class=3D"">solutions closely to the requirements=
. Why do we need to know when audio<br class=3D"">finished? What will happe=
n when we queue audio twice? There are many<br class=3D"">layers (UX, inter=
face, implementation details) of questions to coming up<br class=3D"">with =
a pleasant and stable design.<br class=3D""><br class=3D"">Best regards,<br=
 class=3D"">Marvin<br class=3D""><br class=3D""><blockquote type=3D"cite" c=
lass=3D""><br class=3D"">Thanks,<br class=3D""><br class=3D"">Andrew Fish.<=
br class=3D""><br class=3D""><blockquote type=3D"cite" class=3D"">If I reme=
mber correctly you mentioned the UEFI Talkbox before, if<br class=3D"">that=
 is more convenient for you, I'm there as mhaeuser.<br class=3D""><br class=
=
=3D"">Best regards,<br class=3D"">Marvin<br class=3D""><br class=3D""><blo=
ckquote type=3D"cite" class=3D""><br class=3D"">On 4/16/21, Andrew Fish &lt=
;<a href=3D"mailto:afish@apple.com" class=3D"">afish@apple.com</a> &lt;<a h=
ref=3D"mailto:afish@apple.com" class=3D"">mailto:afish@apple.com</a>&gt; &l=
t;<a href=3D"mailto:afish@apple.com" class=3D"">mailto:afish@apple.com</a> =
&lt;<a href=3D"mailto:afish@apple.com" class=3D"">mailto:afish@apple.com</a=
>&gt;&gt;&gt;<br class=3D"">wrote:<br class=3D""><blockquote type=3D"cite" =
class=3D""><br class=3D""><blockquote type=3D"cite" class=3D"">On Apr 16, 2=
021, at 4:34 AM, Leif Lindholm &lt;<a href=3D"mailto:leif@nuviainc.com" cla=
ss=3D"">leif@nuviainc.com</a> &lt;<a href=3D"mailto:leif@nuviainc.com" clas=
s=3D"">mailto:leif@nuviainc.com</a>&gt;<br class=3D"">&lt;<a href=3D"mailto=
:leif@nuviainc.com" class=3D"">mailto:leif@nuviainc.com</a> &lt;<a href=3D"=
mailto:leif@nuviainc.com" class=3D"">mailto:leif@nuviainc.com</a>&gt;&gt;&g=
t; wrote:<br class=3D""><br class=3D"">Hi Ethin,<br class=3D""><br class=3D=
"">I think we also want to have a SetMode function, even if we don't get<br=
 class=3D"">around to implement proper support for it as part of GSoC (alth=
ough I<br class=3D"">expect at least for virtio, that should be pretty stra=
ightforward).<br class=3D""><br class=3D""></blockquote>Leif,<br class=3D""=
><br class=3D"">I=E2=80=99m think if we have an API to load the buffer and =
a 2nd API to<br class=3D"">play the<br class=3D"">buffer an optional 3rd AP=
I could configure the streams.<br class=3D""><br class=3D""><blockquote typ=
e=3D"cite" class=3D"">It's quite likely that speech for UI would be stored =
as 8kHz (or<br class=3D"">20kHz) in some systems, whereas the example for p=
laying a tune in<br class=3D"">GRUB<br class=3D"">would more likely be a 44=
.1 kHz mp3/wav/ogg/flac.<br class=3D""><br class=3D"">For the GSoC project,=
 I think it would be quite reasonable to<br class=3D"">pre-generate pure PC=
M streams for testing rather than decoding<br class=3D"">anything on the fl=
y.<br class=3D""><br class=3D"">Porting/writing decoders is really a separa=
te task from enabling the<br class=3D"">output. I would much rather see USB=
 *and* HDA support able to play<br class=3D"">pcm<br class=3D"">streams bef=
ore worrying about decoding.<br class=3D""><br class=3D""></blockquote>I ag=
ree it might turn out it is easier to have the text to speech<br class=3D""=
>code just<br class=3D"">encode a PCM directly.<br class=3D""><br class=3D"=
">Thanks,<br class=3D""><br class=3D"">Andrew Fish<br class=3D""><br class=
=3D""><blockquote type=3D"cite" class=3D"">/<br class=3D"">&nbsp;&nbsp;&nb=
sp;Leif<br class=3D""><br class=3D"">On Fri, Apr 16, 2021 at 00:33:06 -0500=
, Ethin Probst wrote:<br class=3D""><blockquote type=3D"cite" class=3D"">Th=
anks for that explanation (I missed Mike's message). Earlier I<br class=3D"=
">sent<br class=3D"">a summary of those things that we can agree on: mainly=
, that we have<br class=3D"">mute, volume control, a load buffer, (maybe) a=
n unload buffer, and a<br class=3D"">start/stop stream function. Now that I=
 fully understand the<br class=3D"">ramifications of this I don't mind sett=
ling for a specific format<br class=3D"">and<br class=3D"">sample rate, and=
 signed 16-bit PCM audio is, I think, the most<br class=3D"">widely<br clas=
s=3D"">used one out there, besides 64-bit floating point samples, which<br =
class=3D"">I've<br class=3D"">only seen used in DAWs, and that's something =
we don't need.<br class=3D"">Are you sure you want the firmware itself to h=
andle the decoding of<br class=3D"">WAV audio? I can make a library class f=
or that, but I'll definitely<br class=3D"">need help with the security aspe=
ct.<br class=3D""><br class=3D"">On 4/16/21, Andrew Fish via <a href=3D"htt=
p://groups.io" class=3D"">groups.io</a> &lt;<a href=3D"http://groups.io" cl=
ass=3D"">http://groups.io</a>&gt; &lt;<a href=3D"http://groups.io" class=3D=
"">http://groups.io</a> &lt;<a href=3D"http://groups.io" class=3D"">http://=
groups.io</a>&gt;&gt;<br class=3D"">&lt;<a href=3D"mailto:afish=3Dapple.com=
@groups.io" class=3D"">afish=3Dapple.com@groups.io</a> &lt;<a href=3D"mailt=
o:afish=3Dapple.com@groups.io" class=3D"">mailto:afish=3Dapple.com@groups.i=
o</a>&gt; &lt;<a href=3D"mailto:afish=3Dapple.com@groups.io" class=3D"">mai=
lto:afish=3Dapple.com@groups.io</a> &lt;<a href=3D"mailto:afish=3Dapple.com=
@groups.io" class=3D"">mailto:afish=3Dapple.com@groups.io</a>&gt;&gt;&gt;<b=
r class=3D"">wrote:<br class=3D""><blockquote type=3D"cite" class=3D""><br =
class=3D""><blockquote type=3D"cite" class=3D"">On Apr 15, 2021, at 5:59 PM=
, Michael Brown &lt;<a href=3D"mailto:mcb30@ipxe.org" class=3D"">mcb30@ipxe=
.org</a> &lt;<a href=3D"mailto:mcb30@ipxe.org" class=3D"">mailto:mcb30@ipxe=
.org</a>&gt;<br class=3D"">&lt;<a href=3D"mailto:mcb30@ipxe.org" class=3D""=
>mailto:mcb30@ipxe.org</a> &lt;<a href=3D"mailto:mcb30@ipxe.org" class=3D""=
>mailto:mcb30@ipxe.org</a>&gt;&gt;&gt; wrote:<br class=3D""><br class=3D"">=
On 16/04/2021 00:42, Ethin Probst wrote:<br class=3D""><blockquote type=3D"=
cite" class=3D"">Forcing a particular channel mapping, sample rate and samp=
le<br class=3D"">format<br class=3D"">on<br class=3D"">everyone would compl=
icate application code. From an<br class=3D"">application point<br class=3D=
"">of view, one would, with that type of protocol, need to do the<br class=
=3D"">following:<br class=3D"">1) Load an audio file in any audio file for=
mat from any storage<br class=3D"">mechanism.<br class=3D"">2) Decode the a=
udio file format to extract the samples and audio<br class=3D"">metadata.<b=
r class=3D"">3) Resample the (now decoded) audio samples and convert<br cla=
ss=3D"">(quantize)<br class=3D"">the<br class=3D"">audio samples into signe=
d 16-bit PCM audio.<br class=3D"">4) forward the samples onto the EFI audio=
 protocol.<br class=3D""></blockquote>You have made an incorrect assumption=
 that there exists a<br class=3D"">requirement<br class=3D"">to<br class=3D=
"">be able to play audio files in arbitrary formats. &nbsp;This<br class=3D=
"">requirement<br class=3D"">does<br class=3D"">not exist.<br class=3D""><b=
r class=3D"">With a protocol-mandated fixed baseline set of audio parameter=
s<br class=3D"">(sample<br class=3D"">rate etc), what would happen in pract=
ice is that the audio<br class=3D"">files would<br class=3D"">be<br class=
=3D"">encoded in that format at *build* time, using tools entirely<br clas=
s=3D"">external<br class=3D"">to<br class=3D"">UEFI. &nbsp;The application =
code is then trivially simple: it just does<br class=3D"">"load<br class=3D=
"">blob, pass blob to audio protocol".<br class=3D""><br class=3D""></block=
quote><br class=3D"">Ethin,<br class=3D""><br class=3D"">Given the goal is =
an industry standard we value interoperability<br class=3D"">more<br class=
=3D"">that<br class=3D"">flexibility.<br class=3D""><br class=3D"">How abo=
ut another use case. Lets say the Linux OS loader (Grub)<br class=3D"">want=
s<br class=3D"">to<br class=3D"">have an accessible UI so it decides to sor=
e sound files on the EFI<br class=3D"">System<br class=3D"">Partition and u=
se our new fancy UEFI Audio Protocol to add audio<br class=3D"">to the<br c=
lass=3D"">OS<br class=3D"">loader GUI. So that version of Grub needs to wor=
k on 1,000 of<br class=3D"">different<br class=3D"">PCs<br class=3D"">and a=
 wide range of UEFI Audio driver implementations. It is a much<br class=3D"=
">easier<br class=3D"">world if Wave PCM 16 bit just works every place. You=
 could add a<br class=3D"">lot of<br class=3D"">complexity and try to encod=
e the audio on the fly, maybe even in<br class=3D"">Linux<br class=3D"">pro=
per but that falls down if you are booting from read only<br class=3D"">med=
ia like<br class=3D"">a<br class=3D"">DVD or backup tape (yes people still =
do that in server land).<br class=3D""><br class=3D"">The other problem wit=
h flexibility is you just made the test matrix<br class=3D"">very<br class=
=3D"">large for every driver that needs to get implemented. For<br class=
=3D"">something as<br class=3D"">complex as Intel HDA how you hook up the =
hardware and what<br class=3D"">CODECs you<br class=3D"">use<br class=3D"">=
may impact the quality of the playback for a given board. Your<br class=3D"=
">EFI is<br class=3D"">likely<br class=3D"">going to pick a single encoding=
 at that will get tested all the<br class=3D"">time if<br class=3D"">your<b=
r class=3D"">system has audio, but all 50 other things you support not so<b=
r class=3D"">much. So<br class=3D"">that<br class=3D"">will required testin=
g, and some one with audiophile ears (or an AI<br class=3D"">program)<br cl=
ass=3D"">to test all the combinations. I=E2=80=99m not kidding I get BZs on=
 the<br class=3D"">quality<br class=3D"">of<br class=3D"">the boot bong on =
our systems.<br class=3D""><br class=3D""><br class=3D""><blockquote type=
=3D"cite" class=3D""><blockquote type=3D"cite" class=3D"">typedef struct E=
FI_SIMPLE_AUDIO_PROTOCOL {<br class=3D"">&nbsp;EFI_SIMPLE_AUDIO_PROTOCOL_RE=
SET Reset;<br class=3D"">&nbsp;EFI_SIMPLE_AUDIO_PROTOCOL_START Start;<br cl=
ass=3D"">&nbsp;EFI_SIMPLE_AUDIO_PROTOCOL_STOP Stop;<br class=3D"">} EFI_SIM=
PLE_AUDIO_PROTOCOL;<br class=3D""></blockquote>This is now starting to look=
 like something that belongs in<br class=3D"">boot-time<br class=3D"">firmw=
are. &nbsp;:)<br class=3D""><br class=3D""></blockquote>I think that got a =
little too simple I=E2=80=99d go back and look at the<br class=3D"">example=
<br class=3D"">I<br class=3D"">posted to the thread but add an API to load =
the buffer, and then<br class=3D"">play<br class=3D"">the<br class=3D"">buf=
fer (that way we can an API in the future to twiddle knobs).<br class=3D"">=
That<br class=3D"">API<br class=3D"">also implements the async EFI interfac=
e. Trust me the 1st thing<br class=3D"">that is<br class=3D"">going to happ=
en when we add audio is some one is going to<br class=3D"">complain in<br c=
lass=3D"">xyz<br class=3D"">state we should mute audio, or we should honer =
audio volume and<br class=3D"">mute<br class=3D"">settings from setup, or f=
rom values set in the OS. Or some one<br class=3D"">is going<br class=3D"">=
to<br class=3D"">want the volume keys on the keyboard to work in EFI.<br cl=
ass=3D""><br class=3D"">Also if you need to pick apart the Wave PCM 16 byte=
 file to feed<br class=3D"">it into<br class=3D"">the<br class=3D"">audio h=
ardware that probably means we should have a library that<br class=3D"">doe=
s<br class=3D"">that<br class=3D"">work, so other Audio drivers can share t=
hat code. Also having a<br class=3D"">library<br class=3D"">makes it easier=
 to write a unit test. We need to be security<br class=3D"">conscious<br cl=
ass=3D"">as we<br class=3D"">need to treat the Audo file as attacker contro=
lled data.<br class=3D""><br class=3D"">Thanks,<br class=3D""><br class=3D"=
">Andrew Fish<br class=3D""><br class=3D""><blockquote type=3D"cite" class=
=3D"">Michael<br class=3D""><br class=3D""><br class=3D""><br class=3D""><=
br class=3D""><br class=3D""></blockquote><br class=3D""><br class=3D""><br=
 class=3D""><br class=3D""><br class=3D""><br class=3D""></blockquote><br c=
lass=3D"">--<span class=3D"Apple-converted-space">&nbsp;</span><br class=3D=
"">Signed,<br class=3D"">Ethin D. Probst<br class=3D""></blockquote><br cla=
ss=3D""><br class=3D""><br class=3D""><br class=3D""></blockquote><br class=
=
=3D""></blockquote><br class=3D""></blockquote><br class=3D""><br class=3D=
""><br class=3D""></blockquote></blockquote><br class=3D""><br class=3D""><=
/blockquote><br class=3D""><br class=3D"">--<span class=3D"Apple-converted-=
space">&nbsp;</span><br class=3D"">Signed,<br class=3D"">Ethin D. Probst<br=
 class=3D""></blockquote><br class=3D""></blockquote><br class=3D""></block=
quote><br style=3D"caret-color: rgb(0, 0, 0); font-family: Helvetica; font-=
size: 12px; font-style: normal; font-variant-caps: normal; font-weight: nor=
mal; letter-spacing: normal; text-align: start; text-indent: 0px; text-tran=
sform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-wi=
dth: 0px; text-decoration: none;" class=3D""><br style=3D"caret-color: rgb(=
0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font=
-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-al=
ign: start; text-indent: 0px; text-transform: none; white-space: normal; wo=
rd-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" cl=
ass=3D""><br style=3D"caret-color: rgb(0, 0, 0); font-family: Helvetica; fo=
nt-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: =
normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-t=
ransform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke=
-width: 0px; text-decoration: none;" class=3D""><span style=3D"caret-color:=
 rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal;=
 font-variant-caps: normal; font-weight: normal; letter-spacing: normal; te=
xt-align: start; text-indent: 0px; text-transform: none; white-space: norma=
l; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none=
; float: none; display: inline !important;" class=3D""></span></div></block=
quote></div><br class=3D""></body></html>

--Apple-Mail=_7356BD0C-473A-494E-9EE4-A226AB7F9F79--