Add operators to runtime:
- `SubOperator`: elementwise subtraction
- `MulOperator`: elementwise multiplication
- `ConcatOperator`: concatenate tensors along a given axis
- `StrideSliceOperator`: strided tensor slicing, e.g. `t[:, ..., :3]`
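For reference, a minimal sketch of what an elementwise kernel behind `SubOperator` / `MulOperator` might look like. The template and names here are hypothetical illustrations, not uTensor's actual operator API; shapes are assumed to match exactly:

```
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical elementwise kernel; uTensor's real Tensor/Operator API differs.
// Assumes both inputs have identical shapes (broadcasting handled elsewhere).
template <typename T, typename BinaryFn>
void elementwise_binary(const std::vector<T>& a, const std::vector<T>& b,
                        std::vector<T>& out, BinaryFn fn) {
  out.resize(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    out[i] = fn(a[i], b[i]);
  }
}

// Sub: pass std::minus<T>(); Mul: pass std::multiplies<T>().
```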
Support scalar tensor broadcasting. Example: given tensor1 with shape (50,) and tensor2 with shape (1,), `AddOp` should broadcast tensor2 over tensor1, so tensor1 + tensor2 has shape (50,). Rationale: It's common for TensorFlow user...
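A minimal sketch of the requested scalar broadcast in an add kernel (hypothetical free function over flat buffers, not uTensor's actual `AddOp`): when one input holds a single element, its value is reused across every element of the other input.

```
#include <cstddef>
#include <vector>

// Hypothetical add with scalar broadcasting: if one operand has shape (1,),
// broadcast its single value over the other operand.
template <typename T>
std::vector<T> add_broadcast(const std::vector<T>& a, const std::vector<T>& b) {
  const std::vector<T>& big   = (a.size() >= b.size()) ? a : b;
  const std::vector<T>& small = (a.size() >= b.size()) ? b : a;
  std::vector<T> out(big.size());
  if (small.size() == 1) {
    for (std::size_t i = 0; i < big.size(); ++i) out[i] = big[i] + small[0];
  } else {
    // otherwise shapes must match exactly
    for (std::size_t i = 0; i < big.size(); ++i) out[i] = big[i] + small[i];
  }
  return out;
}
```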
Something like this:
```
Tensor* t = new Tensor();
t->setShape({3, 3})->setAddress(/*...*/); // set up whatever you need.
```
It would be a lot easier to design a template for the code generator...
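A sketch of the chainable-setter idea (the `Tensor` members below are hypothetical, for illustration only): each setter returns `this`, so the code generator can emit one fluent expression per tensor.

```
#include <cstdint>
#include <initializer_list>
#include <vector>

// Hypothetical Tensor with chainable setters; not uTensor's actual class.
class Tensor {
  std::vector<uint32_t> shape_;
  void* addr_ = nullptr;

 public:
  Tensor* setShape(std::initializer_list<uint32_t> s) {
    shape_.assign(s);
    return this;  // returning `this` is what enables the chaining
  }
  Tensor* setAddress(void* addr) {
    addr_ = addr;
    return this;
  }
};

// Generated code then becomes a single expression per tensor:
// Tensor* t = (new Tensor())->setShape({3, 3})->setAddress(buf);
```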
The current quantization process produces a QuantizedReshape op, which uTensor does not support yet. Reference: https://github.com/tensorflow/tensorflow/blob/f7ec99516ce0e0937e0b865e90aa02c748cd36c6/tensorflow/core/kernels/quantized_reshape_op.cc
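For context, the linked TensorFlow kernel reshapes the quantized buffer and forwards the min/max range scalars unchanged. A minimal sketch of that behavior with hypothetical types (not uTensor's op interface):

```
#include <cstdint>
#include <vector>

// Hypothetical quantized tensor: raw uint8 data plus the float range it maps to.
struct QuantizedTensor {
  std::vector<uint8_t> data;
  std::vector<uint32_t> shape;
  float min, max;
};

// QuantizedReshape semantics: same buffer, same min/max, new shape.
// (The element count implied by new_shape must be unchanged.)
QuantizedTensor quantized_reshape(const QuantizedTensor& in,
                                  const std::vector<uint32_t>& new_shape) {
  return QuantizedTensor{in.data, new_shape, in.min, in.max};
}
```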
It's common to see the softmax operation in a simple multi-layer perceptron. I think we need to support this operation ASAP.
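A minimal, numerically stable softmax sketch for a single row, in plain C++ independent of uTensor's tensor types (assumes a non-empty input):

```
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Numerically stable softmax: subtract the row max before exponentiating
// so exp() cannot overflow. Assumes logits is non-empty.
std::vector<float> softmax(const std::vector<float>& logits) {
  float max_logit = logits[0];
  for (float v : logits) max_logit = std::max(max_logit, v);
  std::vector<float> out(logits.size());
  float sum = 0.0f;
  for (std::size_t i = 0; i < logits.size(); ++i) {
    out[i] = std::exp(logits[i] - max_logit);
    sum += out[i];
  }
  for (float& v : out) v /= sum;
  return out;
}
```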
I found that some reduce functions in uTensor, such as MaxOp and MinOp, behave differently from their counterparts in TensorFlow. I summarize a few cases in the following IPython notebook: https://github.com/dboyliao/utensor_cgen/blob/develop/tests/test_cases.ipynb
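To pin down the reference behavior: TensorFlow's `Max`/`Min` are reductions, and with no axes given they collapse the whole tensor to a scalar. A tiny sketch of that semantics (hypothetical helper, only to make the expected behavior concrete):

```
#include <algorithm>
#include <vector>

// TF-style reduce_max with no axes: collapses the entire tensor to one
// scalar, whatever its original shape. Assumes a non-empty buffer.
float reduce_max_all(const std::vector<float>& flat_data) {
  return *std::max_element(flat_data.begin(), flat_data.end());
}
```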
While using my automatic code generation program, I found that you forgot to check the allocation of the output min/max tensors. To be specific, let's consider...
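A sketch of the missing guard, with a hypothetical minimal tensor type (uTensor's real allocation API differs; this only illustrates the check): allocate the output min/max tensors before writing through them.

```
#include <vector>

// Hypothetical minimal tensor: data stays unallocated until resized.
struct MinimalTensor {
  std::vector<float> data;
  bool allocated() const { return !data.empty(); }
};

// Guard sketch: allocate the output min/max tensors before writing,
// instead of assuming a previous op already did.
void write_range(MinimalTensor& out_min, MinimalTensor& out_max,
                 float lo, float hi) {
  if (!out_min.allocated()) out_min.data.resize(1);
  if (!out_max.allocated()) out_max.data.resize(1);
  out_min.data[0] = lo;
  out_max.data[0] = hi;
}
```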
`main.cpp`:
```
#include "linreg_ctx.hpp"
#include "tensorIdxImporter.hpp"
#include "uTensor_util.hpp"
#include "test.hpp"
#include <mbed.h>
#include <...>
#include <...>

class LinregTest : public Test {
  Context ctx;
  TensorIdxImporter t_import;

public:
  void runAll(void);
};

Serial pc(USBTX, USBRX,...
```
Add `cvxpy` to `setup.py`
- `uTensor::MallocAllocator`: malloc-based allocator
- dynamically enlarge total capacity:
  - `set_ram_total`
  - `set_meta_total`
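A rough sketch of the described allocator. Apart from `set_ram_total` / `set_meta_total`, which come from the list above, every member name here is hypothetical:

```
#include <cstddef>
#include <cstdlib>

// Sketch of a malloc-backed allocator with adjustable capacity budgets.
// Only set_ram_total / set_meta_total are named in the issue; the rest
// is illustrative.
class MallocAllocator {
  size_t ram_total_  = 0;  // budget for tensor data
  size_t meta_total_ = 0;  // budget for tensor metadata
  size_t ram_used_   = 0;

 public:
  void set_ram_total(size_t n)  { ram_total_  = n; }  // enlarge data budget
  void set_meta_total(size_t n) { meta_total_ = n; }  // enlarge metadata budget

  void* allocate(size_t n) {
    if (ram_used_ + n > ram_total_) return nullptr;  // over budget
    void* p = std::malloc(n);
    if (p) ram_used_ += n;
    return p;
  }
  void deallocate(void* p, size_t n) {
    std::free(p);
    ram_used_ -= n;
  }
};
```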