Subject: Re: [edk2-devel] VirtIO Sound Driver (GSoC 2021)
From: "Andrew Fish"
Date: Thu, 15 Apr 2021 15:35:03 -0700
To: Ethin Probst
Cc: edk2-devel-groups-io, Michael Brown, Mike Kinney, Leif Lindholm,
 Laszlo Ersek, "Desimone, Nathaniel L", Rafael Rodrigues Machado,
 Gerd Hoffmann

> On Apr 15, 2021, at 1:11 PM, Ethin Probst wrote:
>
>> Is there any necessity for audio input and output to be implemented within
>> the same protocol?
>> Unlike a network device (which is intrinsically bidirectional), it seems
>> natural to conceptually separate audio input from audio output.
>
> Nope, there isn't a necessity to make them in one, they can be
> separated into two.
>
>> The code controlling volume/mute may not have any access to the sample
>> buffer. The most natural implementation would seem to allow for a platform
>> to notice volume up/down keypresses and use those to control the overall
>> system volume, without any knowledge of which samples (if any) are
>> currently being played by other code in the system.
>
> You're assuming that the audio device you're implementing the
> volume/muting for has volume control and muting functionality within
> itself, then.

Not really. We are assuming that the audio hardware has a better
understanding of how that system implements volume than some generic EFI
code that is, by definition, platform agnostic.

> This may not be the case, and so we'd need to
> effectively simulate it within the driver, which isn't too hard to do.
> As an example, the VirtIO driver does not have a request type for
> muting or for volume control (this would, most likely, be within the
> VIRTIO_SND_R_PCM_SET_PARAMS request, see sec. 5.14.6.4.3). Therefore,
> either the driver would have to simulate the request or return
> EFI_UNSUPPORTED in this instance.
>

So this is an example of the above: the audio hardware knows it is routing
the sound output into another subsystem, and that subsystem controls the
volume. So the VirtIO Sound Driver knows best how to abstract volume/mute
for this platform.

>> Consider also the point of view of the developer implementing a driver
>> for some other piece of audio hardware that happens not to support
>> precisely the same sample rates etc as VirtIO. It would be extremely ugly
>> to force all future hardware to pretend to have the same capabilities as
>> VirtIO just because the API was initially designed with VirtIO in mind.
>
> Precisely, but the brilliance of VirtIO

The brilliance of VirtIO is that it just needs to implement a generic device
driver for a given operating system. In most cases these operating systems
have sound subsystems that manage sound and want fine-grained control over
what is going on. So the drivers are implemented to maximize flexibility,
since the OS has lots of generic code that deals with sound, and even
user-configurable knobs to control audio. In our case that extra layer does
not exist in EFI, and the end-user code just wants to tell the driver to do
some simple things.

Maybe it is easier to think about with an example. Let's say I want to play
a cool sound on every boot. What would be the workflow to make that happen?

1) Encode the sound using a standard tool into a Wave PCM 16.
2) Place the Wave file in the Firmware Volume using a given UUID as the
   name. As simple as editing the platform FDF file.
3) Write some BDS code
   a) Look up the Wave file by UUID and read it into memory.
   b) Point the EFI Sound Protocol at the buffer with the Wave file.
   c) Tell the EFI Sound Protocol to play the sound.

If you start adding in a lot of parameters, that workflow starts getting
really complicated really quickly.

> is that the sample rate,
> sample format, &c., do not have to all be supported by a VirtIO
> device. Notice, also, how in my protocol proposal I noted that the
> sample rates, at least, were "recommended," not "required." Should a
> device not happen to support a sample rate or sample format, all it
> needs to do is return EFI_INVALID_PARAMETER. Section 5.14.6.2.1
> (VIRTIO_SND_R_JACK_GET_CONFIG) describes how a jack tells you what
> sample rates it supports, channel mappings, &c.
>
> I do understand how just using a predefined sample rate and sample
> format might be a good idea, and if that's the best way, then that's
> what we'll do. The protocol can always be revised at a later time if
> necessary.
> I apologize if my stance seems obstinate.
>

I think if we add the version into the protocol and make sure we have
separate load and play operations, we could add a member to set the extra
parameters if needed. There might also be some platform-specific generic
tunables that might be interesting for a future member function.

Thanks,

Andrew Fish

> Also, thank you, Laszlo, for your advice -- I hadn't considered that a
> network driver would be another good way of figuring out how async
> works in UEFI.
>
> On 4/15/21, Andrew Fish wrote:
>>
>>
>>> On Apr 15, 2021, at 5:07 AM, Michael Brown wrote:
>>>
>>> On 15/04/2021 06:28, Ethin Probst wrote:
>>>> - I hoped to add recording in case we in future want to add
>>>> accessibility aids like speech recognition (that was one of the todo
>>>> tasks on the EDK2 tasks list)
>>>
>>> Is there any necessity for audio input and output to be implemented
>>> within the same protocol? Unlike a network device (which is
>>> intrinsically bidirectional), it seems natural to conceptually separate
>>> audio input from audio output.
>>>
>>>> - Muting and volume control could easily be added by just replacing
>>>> the sample buffer with silence and by multiplying all the samples by a
>>>> percentage.
>>>
>>> The code controlling volume/mute may not have any access to the sample
>>> buffer. The most natural implementation would seem to allow for a
>>> platform to notice volume up/down keypresses and use those to control
>>> the overall system volume, without any knowledge of which samples (if
>>> any) are currently being played by other code in the system.
>>>
>>
>> I’ve also thought of adding an NVRAM variable that would let setup, the
>> UEFI Shell, or even the OS set the current volume and mute.
>> This concept of how it would be consumed is why I proposed mute and
>> volume being separate APIs. The volume up/down API in addition to a fixed
>> percentage might be overkill, but it does allow a nonlinear mapping to
>> the volume up/down keys. You would be surprised how picky audiophiles can
>> be, and it seems they like to file Bugzillas.
>>
>>>> - Finally, the reason I used enumerations for specifying parameters
>>>> like sample rate and stuff was that I was looking at this protocol
>>>> from a general UEFI applications point of view. VirtIO supports all of
>>>> the sample configurations listed in my gist, and it seems reasonable
>>>> to allow the application to control those parameters instead of
>>>> forcing a particular parameter configuration onto the developer.
>>>
>>> Consider also the point of view of the developer implementing a driver
>>> for some other piece of audio hardware that happens not to support
>>> precisely the same sample rates etc as VirtIO. It would be extremely
>>> ugly to force all future hardware to pretend to have the same
>>> capabilities as VirtIO just because the API was initially designed with
>>> VirtIO in mind.
>>>
>>> As a developer on the other side of the API, writing code to play sound
>>> files on an arbitrary unknown platform, I would prefer to simply consume
>>> as simple as possible an abstraction of an audio output protocol and not
>>> have to care about what hardware is actually implementing it.
>>>
>>
>> It may make sense to have an API to load the buffer/stream and other APIs
>> to play or pause. This could allow an optional API to configure how the
>> stream is played back. If we add a version to the Protocol, that would at
>> least future-proof us.
>>
>> We did get feedback that it is very common to speed up the audio playback
>> rates for accessibility. I’m not sure if that is practical with a simple
>> PCM 16 wave file with the firmware audio implementation.
>> I guess that is something we could investigate.
>>
>> In terms of maybe adding text to speech, there is an open source project
>> that conceptually we could port to EFI. It would likely be a binary that
>> would have to live on the EFI System Partition due to size. I was
>> thinking that words per minute could be part of that API and it would
>> produce a PCM 16 wave file that the audio protocol we are discussing
>> could play.
>>
>>> Both of these argue in favour of defining a very simple API that
>>> expresses only a common baseline capability that is plausibly
>>> implementable for every piece of audio hardware ever made.
>>>
>>> Coupled with the relatively minimalistic requirements for boot-time
>>> audio, I'd probably suggest supporting only a single format for audio
>>> data, with a fixed sample rate (and possibly only mono output).
>>>
>>
>> In my world the folks that work for Jony asked for a stereo boot bong to
>> transition from left to right :). This is not the CODEC you are looking
>> for was our response…. I also did not mention that some languages are
>> right to left, as the only thing worse than one complex thing is two
>> complex things to implement.
>>
>>> As always: perfection is achieved, not when there is nothing more to
>>> add, but when there is nothing left to take away. :)
>>>
>>
>> "Simplicity is the ultimate sophistication"
>>
>> Thanks,
>>
>> Andrew Fish
>>
>>> Thanks,
>>>
>>> Michael
>>>
>>
>
> -- 
> Signed,
> Ethin D. Probst
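
As a rough illustration of the boot-sound workflow and the versioned,
load-then-play protocol discussed in the reply above, a minimal C sketch
might look like the following. Every name in it (SOUND_PROTOCOL, MockLoad,
MockPlay, MockSoundInit, PlayBootSound) is hypothetical and is not an
existing UEFI or EDK2 interface. The Firmware Volume lookup of step 3a is
left out, and Play only counts calls instead of driving hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* All names below are hypothetical; none come from the UEFI spec or EDK2. */
typedef enum {
  SOUND_OK = 0,
  SOUND_INVALID_PARAMETER,
  SOUND_NOT_READY
} SOUND_STATUS;

typedef struct SOUND_PROTOCOL SOUND_PROTOCOL;

struct SOUND_PROTOCOL {
  uint32_t       Version;  /* lets a later revision append members */
  SOUND_STATUS (*Load)(SOUND_PROTOCOL *This, const uint8_t *Wave, size_t Size);
  SOUND_STATUS (*Play)(SOUND_PROTOCOL *This);
  /* Private state of this mock driver. */
  const uint8_t *Buffer;
  size_t         BufferSize;
  unsigned       PlayCount;
};

/* Load: accept a buffer only if it starts with a RIFF/WAVE container header. */
static SOUND_STATUS MockLoad(SOUND_PROTOCOL *This, const uint8_t *Wave,
                             size_t Size) {
  assert(This != NULL);
  if (Wave == NULL || Size < 12 ||
      memcmp(Wave, "RIFF", 4) != 0 || memcmp(Wave + 8, "WAVE", 4) != 0) {
    return SOUND_INVALID_PARAMETER;
  }
  This->Buffer = Wave;
  This->BufferSize = Size;
  return SOUND_OK;
}

/* Play: a real driver would start the hardware; the mock only counts calls. */
static SOUND_STATUS MockPlay(SOUND_PROTOCOL *This) {
  assert(This != NULL);
  if (This->Buffer == NULL) {
    return SOUND_NOT_READY;
  }
  This->PlayCount++;
  return SOUND_OK;
}

/* Fill in a version 1 instance with nothing loaded yet. */
static void MockSoundInit(SOUND_PROTOCOL *Sound) {
  Sound->Version = 1;
  Sound->Load = MockLoad;
  Sound->Play = MockPlay;
  Sound->Buffer = NULL;
  Sound->BufferSize = 0;
  Sound->PlayCount = 0;
}

/* Steps 3b and 3c of the workflow: hand the Wave buffer over, then play it. */
static SOUND_STATUS PlayBootSound(SOUND_PROTOCOL *Sound, const uint8_t *Wave,
                                  size_t Size) {
  SOUND_STATUS Status = Sound->Load(Sound, Wave, Size);
  if (Status != SOUND_OK) {
    return Status;
  }
  return Sound->Play(Sound);
}
```

A caller would initialize an instance with MockSoundInit and then call
PlayBootSound with the Wave bytes it read from storage. Keeping Load and
Play separate, with a Version field in front, is what would leave room for
an optional parameter-setting member in a later revision without changing
this simple path.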