On Apr 16, 2021, at 10:45 AM, Ethin Probst <harlydavidsen@gmail.com> wrote:

Yes, three APIs (maybe like this) would work well:
- Start, Stop: begin playback of a stream
- SetVolume, GetVolume, Mute, Unmute: control volume of output and enable muting
- CreateStream, ReleaseStream, SetStreamSampleRate: control sample
rate of a stream (but not sample format, since signed 16-bit PCM is
enough)
Marvin, how do you suggest we handle the events then? We need some way
of notifying the caller that the stream has concluded. We could make
the driver create the event and pass it back to the caller, but you'd
still have dangling pointers (this is C, after all). We could instead
make an IsPlaying() function and a WaitForCompletion() function and let
the driver do the event handling internally -- would that work?
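Something like this rough sketch, maybe (AUDIO_PROTOCOL is just a
placeholder name, not an existing type):

typedef struct _AUDIO_PROTOCOL AUDIO_PROTOCOL;

/**
  Poll whether a stream is still playing.
**/
typedef
BOOLEAN
(EFIAPI *AUDIO_IS_PLAYING)(
  IN AUDIO_PROTOCOL  *This
  );

/**
  Block until the current stream finishes playing.
**/
typedef
EFI_STATUS
(EFIAPI *AUDIO_WAIT_FOR_COMPLETION)(
  IN AUDIO_PROTOCOL  *This
  );

That way the driver owns the event internally and the caller never sees
a raw EFI_EVENT.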


Ethin,

I can adapt my example from earlier in the thread for how to do async EFI APIs. 

/**
  The Audio Token structure.
**/
typedef struct {

  ///
  /// Event will be signaled when the audio has finished playing.
  /// If only synchronous playback is supported, the Event must still be signaled.
  ///
  EFI_EVENT               Event;

  ///
  /// Indicates whether the transaction encountered an error; valid once Event has been signaled.
  ///
  EFI_STATUS              TransactionStatus;
} CODE_FIRST_HDA_AUDIO_TOKEN;

/**
  Play synchronous or asynchronous audio. If Token is passed in, the audio is played
  asynchronously; if Token is NULL, this call blocks until the audio is complete.

  @param[in]      This         A pointer to the CODE_FIRST_HDA_AUDIO_PROTOCOL instance.
  @param[in, out] Token        A pointer to the token associated with the transaction.
                               If NULL, this function blocks until audio playback is complete.

  @retval EFI_SUCCESS           The audio started playing. 
  @retval EFI_OUT_OF_RESOURCES  The request could not be completed due to a lack of resources.
  @retval EFI_INVALID_PARAMETER One or more parameters are invalid.

**/
typedef
EFI_STATUS
(EFIAPI *CODE_FIRST_HDA_AUDIO_PROTOCOL_PLAY_STREAM)(
  IN     CODE_FIRST_HDA_AUDIO_PROTOCOL   *This,
  IN OUT CODE_FIRST_HDA_AUDIO_TOKEN      *Token     OPTIONAL
  );


You can fake out the asynchronous interface in your synchronous flow like this:


if (Token != NULL) {
  Token->TransactionStatus = Status;  // return value from sync call
  gBS->SignalEvent (Token->Event);
}

This just gives the caller the experience that the event was signaled before the call returned, which is legal for an async API. 

The caller allocates a CODE_FIRST_HDA_AUDIO_TOKEN and creates an Event with an associated callback function, then calls gAudio->PlayStream (gAudio, Token); the caller's callback function fires when the audio stream completes. 
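As a rough caller-side sketch (gAudio is assumed to have been located
already; OnAudioComplete and PlayExample are made-up names):

/**
  Notify function; fires when the driver signals Token->Event.
**/
STATIC
VOID
EFIAPI
OnAudioComplete (
  IN EFI_EVENT  Event,
  IN VOID       *Context
  )
{
  CODE_FIRST_HDA_AUDIO_TOKEN  *Token;

  Token = (CODE_FIRST_HDA_AUDIO_TOKEN *)Context;
  // TransactionStatus is only valid once the event has been signaled.
  DEBUG ((DEBUG_INFO, "Audio playback done: %r\n", Token->TransactionStatus));
}

VOID
PlayExample (
  VOID
  )
{
  STATIC CODE_FIRST_HDA_AUDIO_TOKEN  Token;  // Must outlive the transaction.
  EFI_STATUS                         Status;

  Status = gBS->CreateEvent (
                  EVT_NOTIFY_SIGNAL,
                  TPL_CALLBACK,
                  OnAudioComplete,
                  &Token,
                  &Token.Event
                  );
  if (!EFI_ERROR (Status)) {
    Status = gAudio->PlayStream (gAudio, &Token);
  }
}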

Thanks,

Andrew Fish

On 4/16/21, Andrew Fish <afish@apple.com> wrote:


On Apr 16, 2021, at 4:34 AM, Leif Lindholm <leif@nuviainc.com> wrote:

Hi Ethin,

I think we also want to have a SetMode function, even if we don't get
around to implementing proper support for it as part of GSoC (although
I expect, at least for virtio, that should be pretty straightforward).


Leif,

I’m thinking if we have an API to load the buffer and a 2nd API to play the
buffer, an optional 3rd API could configure the streams.
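Rough sketch of what I mean (all names invented for illustration; the
configure API is roughly Leif’s SetMode):

typedef struct _CODE_FIRST_AUDIO_PROTOCOL CODE_FIRST_AUDIO_PROTOCOL;

/**
  Load a buffer of signed 16-bit PCM samples for later playback.
**/
typedef
EFI_STATUS
(EFIAPI *CODE_FIRST_AUDIO_LOAD_BUFFER)(
  IN CODE_FIRST_AUDIO_PROTOCOL   *This,
  IN CONST VOID                  *Samples,
  IN UINTN                       SampleBytes
  );

/**
  Optional stream configuration.
**/
typedef
EFI_STATUS
(EFIAPI *CODE_FIRST_AUDIO_SET_MODE)(
  IN CODE_FIRST_AUDIO_PROTOCOL   *This,
  IN UINT32                      SampleRate,   // e.g. 8000 or 44100 Hz
  IN UINT8                       Channels
  );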

It's quite likely that speech for UI would be stored as 8kHz (or
20kHz) in some systems, whereas the example for playing a tune in GRUB
would more likely be a 44.1 kHz mp3/wav/ogg/flac.

For the GSoC project, I think it would be quite reasonable to
pre-generate pure PCM streams for testing rather than decoding
anything on the fly.

Porting/writing decoders is really a separate task from enabling the
output. I would much rather see USB *and* HDA support able to play pcm
streams before worrying about decoding.


I agree; it might turn out to be easier to have the text-to-speech code just
encode PCM directly.

Thanks,

Andrew Fish

/
  Leif

On Fri, Apr 16, 2021 at 00:33:06 -0500, Ethin Probst wrote:
Thanks for that explanation (I missed Mike's message). Earlier I sent
a summary of the things we can agree on: mainly, that we have mute,
volume control, a load buffer, (maybe) an unload buffer, and a
start/stop stream function. Now that I fully understand the
ramifications, I don't mind settling on a specific format and sample
rate, and signed 16-bit PCM is, I think, the most widely used format
out there, besides 64-bit floating-point samples, which I've only seen
used in DAWs -- and that's something we don't need.
Are you sure you want the firmware itself to handle the decoding of
WAV audio? I can make a library class for that, but I'll definitely
need help with the security aspect.

On 4/16/21, Andrew Fish via groups.io <afish=apple.com@groups.io> wrote:


On Apr 15, 2021, at 5:59 PM, Michael Brown <mcb30@ipxe.org> wrote:

On 16/04/2021 00:42, Ethin Probst wrote:
Forcing a particular channel mapping, sample rate, and sample format on
everyone would complicate application code. From an application point
of view, one would, with that type of protocol, need to do the following:
1) Load an audio file in any audio file format from any storage mechanism.
2) Decode the audio file format to extract the samples and audio metadata.
3) Resample the (now decoded) audio samples and convert (quantize) them
into signed 16-bit PCM audio.
4) Forward the samples onto the EFI audio protocol.

You have made an incorrect assumption that there exists a requirement
to be able to play audio files in arbitrary formats. This requirement
does not exist.

With a protocol-mandated fixed baseline set of audio parameters (sample
rate etc.), what would happen in practice is that the audio files would
be encoded in that format at *build* time, using tools entirely external
to UEFI. The application code is then trivially simple: it just does
"load blob, pass blob to audio protocol".



Ethin,

Given the goal is an industry standard, we value interoperability more
than flexibility.

How about another use case. Let’s say the Linux OS loader (Grub) wants
to have an accessible UI, so it decides to store sound files on the EFI
System Partition and use our new fancy UEFI Audio Protocol to add audio
to the OS loader GUI. That version of Grub needs to work on 1,000s of
different PCs and a wide range of UEFI Audio driver implementations. It
is a much easier world if Wave PCM 16 bit just works every place. You
could add a lot of complexity and try to encode the audio on the fly,
maybe even in Linux proper, but that falls down if you are booting from
read-only media like a DVD or backup tape (yes, people still do that in
server land).

The other problem with flexibility is you just made the test matrix very
large for every driver that needs to get implemented. For something as
complex as Intel HDA, how you hook up the hardware and what CODECs you
use may impact the quality of the playback for a given board. Your EFI
is likely going to pick a single encoding, and that will get tested all
the time if your system has audio, but the 50 other things you support,
not so much. So that will require testing, and someone with audiophile
ears (or an AI program) to test all the combinations. I’m not kidding:
I get BZs on the quality of the boot bong on our systems.


typedef struct EFI_SIMPLE_AUDIO_PROTOCOL {
  EFI_SIMPLE_AUDIO_PROTOCOL_RESET  Reset;
  EFI_SIMPLE_AUDIO_PROTOCOL_START  Start;
  EFI_SIMPLE_AUDIO_PROTOCOL_STOP   Stop;
} EFI_SIMPLE_AUDIO_PROTOCOL;

This is now starting to look like something that belongs in boot-time
firmware.  :)


I think that got a little too simple. I’d go back and look at the example
I posted to the thread, but add an API to load the buffer and then play
the buffer (that way we can add an API in the future to twiddle knobs).
That API also implements the async EFI interface. Trust me, the 1st thing
that is going to happen when we add audio is someone is going to complain
that in xyz state we should mute audio, or that we should honor audio
volume and mute settings from setup, or from values set in the OS. Or
someone is going to want the volume keys on the keyboard to work in EFI.
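For those knobs, something as small as this could be added later without
disturbing the load/play APIs (sketch only; the name and the percent
scale are invented):

/**
  Set the output volume. Volume of 0 mutes; 100 is maximum.
**/
typedef
EFI_STATUS
(EFIAPI *CODE_FIRST_HDA_AUDIO_PROTOCOL_SET_VOLUME)(
  IN CODE_FIRST_HDA_AUDIO_PROTOCOL   *This,
  IN UINT8                           Volume
  );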

Also, if you need to pick apart the Wave PCM 16-bit file to feed it into
the audio hardware, that probably means we should have a library that
does that work, so other Audio drivers can share that code. Having a
library also makes it easier to write a unit test. We need to be security
conscious, as we need to treat the Audio file as attacker-controlled data.
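To make the attacker-controlled-data point concrete, here is a very
rough sketch of the kind of bounds checking such a library needs (names
invented; uses BaseLib/BaseMemoryLib CompareMem, ZeroMem, and
ReadUnaligned16/32):

/**
  Parsed information about a PCM WAVE file.
**/
typedef struct {
  UINT16       Channels;
  UINT32       SampleRate;
  UINT16       BitsPerSample;
  CONST UINT8  *Data;       // Points into the caller's buffer.
  UINT32       DataSize;
} WAVE_INFO;

EFI_STATUS
EFIAPI
WaveParsePcm (
  IN  CONST UINT8  *Buffer,
  IN  UINTN        BufferSize,
  OUT WAVE_INFO    *Info
  )
{
  UINTN   Offset;
  UINT32  ChunkSize;

  //
  // Treat Buffer as attacker controlled: validate every offset and length.
  //
  if ((Buffer == NULL) || (Info == NULL) || (BufferSize < 12)) {
    return EFI_INVALID_PARAMETER;
  }
  ZeroMem (Info, sizeof (*Info));
  if ((CompareMem (Buffer, "RIFF", 4) != 0) ||
      (CompareMem (Buffer + 8, "WAVE", 4) != 0)) {
    return EFI_UNSUPPORTED;
  }

  Offset = 12;
  while (Offset + 8 <= BufferSize) {
    ChunkSize = ReadUnaligned32 ((CONST UINT32 *)(Buffer + Offset + 4));
    if (ChunkSize > BufferSize - Offset - 8) {
      return EFI_VOLUME_CORRUPTED;  // Chunk claims more data than exists.
    }
    if ((CompareMem (Buffer + Offset, "fmt ", 4) == 0) && (ChunkSize >= 16)) {
      if (ReadUnaligned16 ((CONST UINT16 *)(Buffer + Offset + 8)) != 1) {
        return EFI_UNSUPPORTED;  // Not plain PCM.
      }
      Info->Channels      = ReadUnaligned16 ((CONST UINT16 *)(Buffer + Offset + 10));
      Info->SampleRate    = ReadUnaligned32 ((CONST UINT32 *)(Buffer + Offset + 12));
      Info->BitsPerSample = ReadUnaligned16 ((CONST UINT16 *)(Buffer + Offset + 22));
    } else if (CompareMem (Buffer + Offset, "data", 4) == 0) {
      Info->Data     = Buffer + Offset + 8;
      Info->DataSize = ChunkSize;
    }
    Offset += 8 + ChunkSize + (ChunkSize & 1);  // Chunks are 2-byte aligned.
  }

  return (Info->Data == NULL) ? EFI_NOT_FOUND : EFI_SUCCESS;
}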

Thanks,

Andrew Fish

Michael


--
Signed,
Ethin D. Probst
