[SOLVED] Machine-Learning- HW2: Phoneme Classification

Task: Multiclass Classification

[Figure: framewise phoneme labels, e.g. M M M AH AH SH SH IH IH IH N N N N …]

Framewise phoneme prediction from speech.

What is a phoneme?

A unit of speech sound in a language that can serve to distinguish one word from another.

  • bat / pat , bad / bed
  • Machine Learning → M AH SH IH N L ER N IH NG

Data Preprocessing

Acoustic Features – MFCCs (Mel Frequency Cepstral Coefficients)

 

[Figure: an input of shape (11,39) and its label]

More Information About the Data

Since each frame only contains 25 ms of speech, a single frame is unlikely to represent a complete phoneme.

[Figure: prev frames and future frames around the labeled frame; flatten / reshape to (11,39)]

  • Usually, a phoneme will span several frames
    • Hint: post-processing may help
  • Concatenate the neighboring frames for training
    • In this HW, we concatenate the past five and the future five frames for training (total 11 frames)
    • You may reshape the input (1,429) back to (11,39) to get the 11 separate frames
    • Just remember that the label corresponds to the center frame

  • Finding testing labels or doing human labeling is strictly prohibited!
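The reshaping step can be sketched with NumPy as below. The array `x` is random stand-in data, not the homework files, and the frame-major (row-major) flattening order is an assumption. Only the shapes (429 = 11 x 39) and the center-frame convention come from the text.

```python
import numpy as np

N_FRAMES = 11   # 5 past + 1 center + 5 future
N_MFCC = 39     # MFCC feature dimension per frame

# Stand-in batch of 4 flattened inputs (hypothetical data, not the real dataset).
rng = np.random.default_rng(0)
x = rng.standard_normal((4, N_FRAMES * N_MFCC))   # shape (4, 429)

# Recover the 11 separate frames of every example.
# Assumes the frames were flattened in row-major order, earliest frame first.
frames = x.reshape(-1, N_FRAMES, N_MFCC)          # shape (4, 11, 39)

# The label of each example belongs to the center frame (index 5).
center = frames[:, N_FRAMES // 2, :]              # shape (4, 39)

print(frames.shape, center.shape)
```

Under that ordering assumption, the center frame occupies columns 195 to 233 of the flattened input.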

Introduction to Digital Speech Processing

Dataset & Data Format

  • Dataset: TIMIT Acoustic-Phonetic Continuous Speech Corpus
    • Phonetically balanced for English
  • Data Format (The TAs have already preprocessed the data), in timit_11/
    • npy training data (# of training frames, 11 x feature dim)
    • npy framewise phoneme label (0-38)
    • npy testing data (# of testing frames, 11 x feature dim)
  • Acoustic features (39-dim MFCC)
    • Concatenate the past and the future five frames (feature dim = 11 x 39)
    • The phoneme label of each input corresponds to the center frame

  • Using additional data is prohibited. If you do, your final grade will be multiplied by 0.9!
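The provided data is already concatenated, but a small sketch may help show what the 11-frame context means. The helper `add_context` and the toy utterance are hypothetical. The edge padding (repeating the first and last frame) is an assumption made here, not something stated in the homework.

```python
import numpy as np

def add_context(utterance, n_context=5):
    """Stack each frame with its n_context past and future frames.

    utterance: array of shape (T, feature_dim), one row per frame.
    Returns an array of shape (T, (2 * n_context + 1) * feature_dim).
    Edge frames are padded by repeating the first/last frame; this
    padding choice is an assumption, not taken from the homework.
    """
    T, _ = utterance.shape
    padded = np.pad(utterance, ((n_context, n_context), (0, 0)), mode="edge")
    # Window i covers padded rows i .. i + 2 * n_context,
    # so its center row is the original frame i.
    windows = [padded[i:i + 2 * n_context + 1].reshape(-1) for i in range(T)]
    return np.stack(windows)

# Toy utterance: 20 frames of 39-dim features (hypothetical values).
utt = np.arange(20 * 39, dtype=np.float32).reshape(20, 39)
X = add_context(utt)
print(X.shape)  # (20, 429)
```

Each row of `X` has the same layout as one input in the dataset description: 11 frames of 39 features, with the label attached to the middle frame.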

 

Class  Phoneme  Example    Class  Phoneme  Example    Class  Phoneme  Example
0      iy       beet       13     l        lay        26     dx       muddy
1      ih       bit        14     r        ray        27     g        gay
2      eh       bet        15     y        yacht      28     p        pea
3      ae       bat        16     w        way        29     t        tea
4      ah       but        17     er       bird       30     k        key
5      uw       boot       18     m        mom        31     z        zone
6      uh       book       19     n        noon       32     v        van
7      aa       bob        20     ng       sing       33     f        fin
8      ey       bait       21     ch       choke      34     th       thin
9      ay       bite       22     jh       joke       35     s        sea
10     oy       boy        23     dh       then       36     sh       she
11     aw       bout       24     b        bee        37     hh       hay
12     ow       boat       25     d        day        38     sil      silence/closure sounds
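As an illustration of how the class indices relate to a phoneme sequence, the sketch below maps framewise labels to symbols and merges consecutive repeats. The list is copied from the table above. The helper `collapse` is hypothetical and is not part of the homework's required output, which is framewise.

```python
# Class index -> phoneme symbol, copied from the table above.
PHONEMES = [
    "iy", "ih", "eh", "ae", "ah", "uw", "uh", "aa", "ey", "ay", "oy", "aw", "ow",
    "l", "r", "y", "w", "er", "m", "n", "ng", "ch", "jh", "dh", "b", "d",
    "dx", "g", "p", "t", "k", "z", "v", "f", "th", "s", "sh", "hh", "sil",
]

def collapse(frame_labels):
    """Turn framewise class indices into symbols, merging consecutive repeats."""
    out = []
    for idx in frame_labels:
        sym = PHONEMES[idx]
        if not out or out[-1] != sym:
            out.append(sym)
    return out

# Framewise labels for the "M M M AH AH SH SH IH IH IH N N N N" example.
frame_labels = [18, 18, 18, 4, 4, 36, 36, 1, 1, 1, 19, 19, 19, 19]
print(collapse(frame_labels))  # ['m', 'ah', 'sh', 'ih', 'n']
```

Note that merging repeats also merges a phoneme that genuinely occurs twice in a row, so this is only a reading aid.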

Sample Code

Colab Link:

https://colab.research.google.com/github/ga642381/ML2021-Spring/blob/main/HW02/HW02-1.ipynb

  • Simple baseline
    • You should be able to pass the simple baseline using the sample code provided.
  • Strong baseline
    • Model architecture (layers? dimension? activation function?)
    • Training (batch size? optimizer? learning rate? epoch?)
    • Tips (batch norm? dropout? regularization?)

2  Hessian Matrix

Task Introduction

Task:  Hessian Matrix

Imagine we are training a neural network and want to find out whether the model has reached a “local minima like” point, a saddle point, or none of the above. We can decide by calculating the Hessian matrix.

What is the Hessian?

The Hessian is the matrix of second-order partial derivatives of the loss with respect to the model parameters. It is highly recommended to watch the lecture video before starting this part.
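Written out, with notation chosen here for illustration (loss L, parameter vector θ with n entries, gradient g), the Hessian H is the n x n matrix:

```latex
g = \nabla_\theta L(\theta), \qquad
H_{ij} = \frac{\partial^2 L(\theta)}{\partial \theta_i \, \partial \theta_j},
\quad i, j = 1, \dots, n
```

At a point where g = 0, positive eigenvalues of H all around indicate a local minimum, while a mix of positive and negative eigenvalues indicates a saddle point.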

The target function in this task is a one-variable sinc function.

You will get

  • a model checkpoint trained by the TA,
  • a batch of training data,
  • a loss function.

You will calculate the Hessian matrix and make the decision accordingly.

Gradient Norm / Minimum Ratio

1.  Gradient Norm

In a normal training process, gradients are rarely exactly zero. In this homework, we regard any gradient norm less than 1e-3 as zero.

2.  Minimum Ratio

For an ideal local minimum, all the eigenvalues of the Hessian matrix are greater than zero. We define the proportion of positive eigenvalues as the minimum ratio.

In this homework, if the minimum ratio is greater than 0.5 and the gradient norm is less than 1e-3, then we assume that the model is at a “local minima like” point.

Gradient Norm / Minimum Ratio

In this homework, we assume that

  • gradient norm < 1e-3 and minimum ratio > 0.5 => local minima like,
  • gradient norm < 1e-3 and minimum ratio <= 0.5 => saddle point,
  • gradient norm >= 1e-3 => none of the above.
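The decision rule can be sketched as follows. This is not the homework's sample code (which should not be changed). The function `classify_point` is hypothetical, and the L2 norm of the flattened gradient is an assumption about how the gradient norm is measured. The Hessian is assumed to be a symmetric NumPy array.

```python
import numpy as np

def classify_point(grad, hessian, tol=1e-3):
    """Apply the gradient-norm / minimum-ratio rule to one point.

    grad: gradient vector of shape (n,).
    hessian: symmetric Hessian of shape (n, n).
    Returns (gradient norm, minimum ratio, verdict).
    """
    # Assumption: the gradient norm is the L2 norm of the gradient vector.
    grad_norm = float(np.linalg.norm(grad))
    # eigvalsh is meant for symmetric matrices and returns real eigenvalues.
    eigvals = np.linalg.eigvalsh(hessian)
    # Minimum ratio = proportion of strictly positive eigenvalues.
    min_ratio = float(np.mean(eigvals > 0))
    if grad_norm >= tol:
        verdict = "none of the above"
    elif min_ratio > 0.5:
        verdict = "local minima like"
    else:
        verdict = "saddle point"
    return grad_norm, min_ratio, verdict

# Toy checks at the origin: f = x^2 + y^2 (bowl) and f = x^2 - y^2 (saddle).
print(classify_point(np.zeros(2), np.diag([2.0, 2.0])))
print(classify_point(np.zeros(2), np.diag([2.0, -2.0])))
```

For the saddle example, exactly half of the eigenvalues are positive, so the minimum ratio is 0.5 and the rule reports a saddle point because the threshold is strict.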

Important Notice

  • You don’t need to and shouldn’t change any part of the code.
  • You can only use Colab to run the code. Otherwise, your result might differ because of differences in the environment.
  • You will get a different checkpoint according to your student ID, so please make sure to fill in your student ID in the sample code correctly.

Sample Code

Colab Link:

https://colab.research.google.com/github/ga642381/ML2021-Spring/blob/main/HW02/HW02-2.ipynb

  • After executing the sample code, you should get a result like this.
  • Notice that each student will get a different answer, so your answer may differ from the example.

Choose your answer from local minima like, saddle point, or none of the above