
LPCNet Python code for inference with Tacotron2 features

Open alokprasad opened this issue 5 years ago • 9 comments

Synthesize the waveform using LPCNet: you should use the C version of LPCNet and make sure that LPCNet is built with the TACOTRON2 macro.

@MlWoo what changes will be required in test_lpcnet.py to make it work with features from Tacotron2?

alokprasad avatar Mar 31 '20 06:03 alokprasad

@alokprasad Good question. The other 18 dimensions of the input features are computed by the C code; the Python code does not calculate them. Maybe I should expose the C routine to Python with Cython or something similar, or write a script over the weekend.

MlWoo avatar Mar 31 '20 23:03 MlWoo

@MlWoo I will wait for the changes; meanwhile I will try it myself.

alokprasad avatar Apr 01 '20 06:04 alokprasad

@alokprasad I have implemented the code. You can view it on the cmake branch. I hope you can find any bugs so we can fix them together. I will close the issue in a week.

MlWoo avatar Apr 12 '20 09:04 MlWoo

@MlWoo Thanks for the patch, I will try it and update you.

alokprasad avatar Apr 13 '20 04:04 alokprasad

Hi, is there any update about this issue?

I tried building taco2+lpcnet but I am facing this issue with cmake:

CMakeFiles/test_vec.dir/library/src/test_vec.c.o: In function `test_sgemv_accum16':
/home/stuart/sagar/speech_analysis_synth/lpctron_cmake/LPCNet/library/src/test_vec.c:72: undefined reference to `sgemv_accum16_fast'
CMakeFiles/test_vec.dir/library/src/test_vec.c.o: In function `test_sparse_sgemv_accum16':
/home/stuart/sagar/speech_analysis_synth/lpctron_cmake/LPCNet/library/src/test_vec.c:106: undefined reference to `sparse_sgemv_accum16_fast'

raikarsagar avatar Nov 24 '20 15:11 raikarsagar

@raikarsagar Sorry, I forgot to push the cmake files to the repo. Please pull the update and run it again.

MlWoo avatar Nov 25 '20 06:11 MlWoo

Thanks for the update. Inference with test_lpcnet.py is much slower than with the C version. There shouldn't be much performance degradation, since we are calling the native C function lpc_from_cepstrum(). Am I correct? Or is there any further change to be made?

raikarsagar avatar Nov 25 '20 08:11 raikarsagar

@raikarsagar That is not quite correct. Calling the native C function lpc_from_cepstrum() does NOT cause the slowdown. The slowdown comes from test_lpcnet.py using the TF implementation for inference, which takes much more time.

MlWoo avatar Nov 25 '20 08:11 MlWoo
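When comparing the TF path with the C path, it can help to express the timing as a real-time factor instead of raw seconds. A minimal sketch, assuming 160-sample frames (the FRAME_SIZE used in the ctypes script later in this thread) and 16 kHz audio; the helper name real_time_factor is hypothetical and not part of LPCNet:

```python
FRAME_SIZE = 160      # samples per frame, same value as in the ctypes script in this thread
SAMPLE_RATE = 16000   # assumption: 16 kHz output audio


def real_time_factor(elapsed_s, n_frames,
                     frame_size=FRAME_SIZE, sample_rate=SAMPLE_RATE):
    """Return synthesis time divided by the duration of the audio produced.

    Values below 1.0 mean the vocoder runs faster than real time.
    """
    audio_s = n_frames * frame_size / sample_rate
    if audio_s <= 0:
        raise ValueError("n_frames must be positive")
    return elapsed_s / audio_s
```

For example, timing the whole synthesis loop with time.time() and passing the elapsed seconds together with the number of feature frames gives one number per backend that can be compared directly.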

Here is the code I wrote to call the C code from Python.

Add these lines to the Makefile:

test_lpcnet.so: $(test_lpcnet_objs)
    gcc -shared -o $@ $(CFLAGS) $(test_lpcnet_objs) -lm

(Add test_lpcnet.so to line 6)

all: dump_data test_lpcnet test_vec test_lpcnet.so

Generate shared library

make test_lpcnet.so taco=1

Create a python file to consume shared library

from ctypes import *
import os
import numpy as np
import time

my_path = os.path.abspath(os.path.dirname(__file__))

lpcnet_lib = CDLL(f"{my_path}/LPCNet/test_lpcnet.so")

FEATURE_CONV1_STATE_SIZE = 102*2
FEATURE_CONV2_STATE_SIZE = 128*2
GRU_A_STATE_SIZE = 384
GRU_B_STATE_SIZE = 16

class NNetState(Structure):
    _fields_ = [
        ("feature_conv1_state", c_float*FEATURE_CONV1_STATE_SIZE),
        ("feature_conv2_state", c_float*FEATURE_CONV2_STATE_SIZE),
        ("gru_a_state", c_float*GRU_A_STATE_SIZE),
        ("gru_b_state", c_float*GRU_B_STATE_SIZE)]

LPC_ORDER = 16
FEATURE_CONV1_DELAY = 1
FEATURE_CONV2_DELAY = 1
FEATURES_DELAY = FEATURE_CONV1_DELAY + FEATURE_CONV2_DELAY
FEATURE_CONV2_OUT_SIZE = 102

class LPCNetState(Structure):
    _fields_ = [
        ("nnet", NNetState),
        ("last_exc", c_int),
        ("last_sig", c_float*LPC_ORDER),
        ("old_input", c_float*FEATURES_DELAY*FEATURE_CONV2_OUT_SIZE),
        ("old_lpc", c_float*FEATURES_DELAY*LPC_ORDER),
        ("old_gain", c_float*FEATURES_DELAY),
        ("frame_count", c_int),
        ("deemph_mem", c_float)]


lpcnet_create = lpcnet_lib.lpcnet_create
lpcnet_create.argtypes = []
lpcnet_create.restype = POINTER(LPCNetState)

lpcnet_synthesize = lpcnet_lib.lpcnet_synthesize
lpcnet_synthesize.argtypes = [POINTER(LPCNetState), POINTER(c_short), POINTER(c_float), c_int]
lpcnet_synthesize.restype = None  # the C function returns void

NB_BANDS = 18
FRAME_SIZE = 160
NB_FEATURES = 38
NB_TOTAL_FEATURES = 55

net = lpcnet_create()

pcm_type = c_short * FRAME_SIZE
pcm = pcm_type()

def synthesize(feature_file, fout):
    # Load the raw float32 features: NB_BANDS+2 values per frame.
    all_features = np.fromfile(feature_file, dtype='float32')

    all_features_re = np.reshape(all_features, (-1, NB_BANDS+2))
    feature_type = c_float * NB_FEATURES

    for f in all_features_re:
        # Expand each 20-value frame to the 38-value layout; entries 18..35 stay zero.
        feature = np.zeros(NB_FEATURES, float)
        feature[:18] = f[:18]
        feature[36:] = f[18:]

        # Copy the frame into a C float array.
        feature_p = feature_type(*feature)

        # Writes FRAME_SIZE 16-bit samples into the shared pcm buffer.
        lpcnet_synthesize(net, pcm, feature_p, FRAME_SIZE)

        pcm_numpy = np.ctypeslib.as_array(pcm)
        pcm_numpy.tofile(fout)

if __name__ == "__main__":
    start = time.time()
    features_file = [f"result/feature-{i}.f32" for i in range(35)]
    # The with block closes the output file even if synthesis fails.
    with open("result/test2.s16", 'wb') as fout:
        for f in features_file:
            synthesize(f, fout)
    print(time.time()-start)

The timings are similar to running the C code directly.

dmelli avatar Mar 11 '21 14:03 dmelli
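The per-frame feature expansion in the script above (20 values in, 38 values out, with entries 18..35 left at zero) can also be done for all frames at once. A minimal sketch, assuming the same layout as that script; the helper names pack_features and frame_pointer are hypothetical, and the description of the two trailing values as pitch parameters reflects my understanding of LPCNet's feature layout rather than anything stated in this thread:

```python
import ctypes

import numpy as np

NB_BANDS = 18      # cepstral bands per frame, as in the script above
NB_FEATURES = 38   # per-frame vector size passed to lpcnet_synthesize


def pack_features(raw):
    """Expand (n_frames, 20) features to a (n_frames, 38) float32 array."""
    frames = np.asarray(raw, dtype=np.float32).reshape(-1, NB_BANDS + 2)
    packed = np.zeros((frames.shape[0], NB_FEATURES), dtype=np.float32)
    packed[:, :NB_BANDS] = frames[:, :NB_BANDS]      # first 18 values
    packed[:, 2 * NB_BANDS:] = frames[:, NB_BANDS:]  # two trailing values (assumed pitch parameters)
    return packed


def frame_pointer(packed, i):
    """Return a float* to row i without copying; rows of packed are contiguous float32."""
    return packed[i].ctypes.data_as(ctypes.POINTER(ctypes.c_float))
```

With this, the loop body would reduce to calling lpcnet_synthesize(net, pcm, frame_pointer(packed, i), FRAME_SIZE) for each row, which avoids building a new ctypes array per frame. The array must be float32 for the pointer cast to be valid, which is why pack_features converts explicitly.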