feat: general backend #46

xhebox · 2023-06-30T02:10:56Z

Main changes:

Duplicate kernel_opt.h and kernel_opt.cpp into each platform directory (x86/arm)
Use cmake to control whether src/kern/optimized/${arch} or src/kern/gpu is compiled
Remove platform macros such as ifdef INFER_XXX from src/kern/optimized/${arch}

This allows more flexibility in platform implementations: every arch can implement a different set of kernels. You no longer need to write every kernel in kernel_opt.h; you can implement only some of them if you like.

For example, for the rvv arch, you don't need to write vectorized C++ code and rely on the compiler, which does not quite work on rvv.
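
To illustrate the idea of each arch implementing only some kernels, here is a minimal sketch. It is not the code from this PR: the names (ElemwiseFn, KernelTable, make_kernel_table, and the naive/opt namespaces) are hypothetical, and falling back to a naive kernel when an arch lacks an optimized one is an assumption made for the example.

```cpp
#include <cstddef>

// Hypothetical kernel signature for illustration: out[i] = a[i] (op) b[i].
using ElemwiseFn = void (*)(const float* a, const float* b, float* out, std::size_t n);

namespace naive {  // portable reference kernels, assumed to be always available
inline void add(const float* a, const float* b, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
}
inline void mul(const float* a, const float* b, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) out[i] = a[i] * b[i];
}
}  // namespace naive

namespace opt {  // stands in for one src/kern/optimized/${arch} directory
// This hypothetical arch ships only an add kernel (unrolled by 4) and no mul.
inline void add(const float* a, const float* b, float* out, std::size_t n) {
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        out[i] = a[i] + b[i];
        out[i + 1] = a[i + 1] + b[i + 1];
        out[i + 2] = a[i + 2] + b[i + 2];
        out[i + 3] = a[i + 3] + b[i + 3];
    }
    for (; i < n; ++i) out[i] = a[i] + b[i];  // remainder elements
}
}  // namespace opt

struct KernelTable {
    ElemwiseFn add;
    ElemwiseFn mul;
};

// Start from the naive kernels, then override only what the arch provides.
inline KernelTable make_kernel_table() {
    KernelTable table{naive::add, naive::mul};
    table.add = opt::add;  // the only kernel this arch implements
    return table;
}
```

A caller would go through the table (for example make_kernel_table().mul(a, b, out, n)) and not need to know which arch supplied which kernel.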

Signed-off-by: xhe <xw897002528@gmail.com>

chenqy4933 · 2023-07-03T05:05:02Z

👍

xhebox · 2023-07-03T05:17:24Z

Thanks for the invitation. So, does this PR look good to you? @chenqy4933

chenqy4933 · 2023-07-03T05:26:16Z

Building locally fails with these errors:

/tmp/cclkOYZK.s:39948: Error: symbol `_ZZN8inferllm3opt24vec_vec_dot_q40_with_q80EiPKvS2_E19__PRETTY_FUNCTION__' is already defined
/tmp/cclkOYZK.s:39953: Error: symbol `_ZZN8inferllm3opt24vec_vec_dot_q40_with_q80EiPKvS2_E19__PRETTY_FUNCTION__' is already defined
/tmp/cclkOYZK.s:39983: Error: symbol `_ZZN8inferllm3opt19dequantize_row_q4_0EPKvPfiE19__PRETTY_FUNCTION__' is already defined
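
The symbols in these assembler errors are mangled C++ names, which are easier to read once demangled. A small helper like the one below can do that; it is only a sketch, the function name demangle is made up for this example, and it assumes a GCC or Clang toolchain that provides abi::__cxa_demangle in <cxxabi.h>.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)

#include <cstdlib>
#include <string>

// Demangle an Itanium-ABI symbol name.
// Returns the input unchanged if demangling fails.
inline std::string demangle(const std::string& mangled) {
    int status = 0;
    // With null buffer arguments the result is malloc-allocated,
    // so it has to be released with free.
    char* raw = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || raw == nullptr) {
        return mangled;
    }
    std::string readable(raw);
    std::free(raw);
    return readable;
}
```

Passing the first symbol from the log to demangle shows that it refers to the __PRETTY_FUNCTION__ local inside inferllm::opt::vec_vec_dot_q40_with_q80; the command-line tool c++filt does the same job.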

xhebox · 2023-07-03T05:28:36Z

I'll check later.

xhebox · 2023-07-03T05:38:42Z

@chenqy4933 Can you provide me some more info? For example, the output of cmake, with the following lines inserted at L61

message(STATUS "${INFER_ARCH}")
message(STATUS "${SRC}")
message(STATUS "${PLTSRC}")

chenqy4933 · 2023-07-03T08:17:18Z

It works correctly; I tested on another x86 machine. The log is:

-- INFER_ARCH is : x86
-- SRC list: /home/cqy/project/InferLLM/src/core/device.cpp;/home/cqy/project/InferLLM/src/core/device.h;/home/cqy/project/InferLLM/src/core/graph.cpp;/home/cqy/project/InferLLM/src/core/graph.h;/home/cqy/project/InferLLM/src/core/kvstorage.h;/home/cqy/project/InferLLM/src/core/kvstotage.cpp;/home/cqy/project/InferLLM/src/core/model.cpp;/home/cqy/project/InferLLM/src/core/model_imp.cpp;/home/cqy/project/InferLLM/src/core/model_imp.h;/home/cqy/project/InferLLM/src/core/op.cpp;/home/cqy/project/InferLLM/src/core/op.h;/home/cqy/project/InferLLM/src/core/tensor.cpp;/home/cqy/project/InferLLM/src/core/tensor.h;/home/cqy/project/InferLLM/src/core/thread_pool.cpp;/home/cqy/project/InferLLM/src/core/thread_pool.h;/home/cqy/project/InferLLM/src/file.cpp;/home/cqy/project/InferLLM/src/file.h;/home/cqy/project/InferLLM/src/graph/baichuan.cpp;/home/cqy/project/InferLLM/src/graph/baichuan.h;/home/cqy/project/InferLLM/src/graph/chatGLM.cpp;/home/cqy/project/InferLLM/src/graph/chatGLM.h;/home/cqy/project/InferLLM/src/graph/graph_imp.cpp;/home/cqy/project/InferLLM/src/graph/llama.cpp;/home/cqy/project/InferLLM/src/graph/llama.h;/home/cqy/project/InferLLM/src/kern/kernel.h;/home/cqy/project/InferLLM/src/kern/kernel_define.h;/home/cqy/project/InferLLM/src/kern/naive/naive.cpp;/home/cqy/project/InferLLM/src/kern/naive/naive.h;/home/cqy/project/InferLLM/src/kern/naive/quantize.h;/home/cqy/project/InferLLM/src/utils.cpp;/home/cqy/project/InferLLM/src/utils.h
-- PLTSRC list: /home/cqy/project/InferLLM/src/kern/optimized/x86/common.h;/home/cqy/project/InferLLM/src/kern/optimized/x86/kernel.cpp;/home/cqy/project/InferLLM/src/kern/optimized/x86/kernel.h;/home/cqy/project/InferLLM/src/kern/optimized/x86/optimized.h;/home/cqy/project/InferLLM/src/kern/optimized/x86/quantize.h

By the way, there are errors when GPU is enabled, although the GPU backend is not complete yet.

xhebox · 2023-07-03T11:59:17Z

Hmm, this is strange; I can't reproduce it. Can you paste your full log from rm -rf build; cmake -Bbuild .; cmake --build build -v? Just paste it to https://pastebin.com/ or somewhere else.

chenqy4933 · 2023-07-04T01:56:48Z

I will debug it on my machine. LGTM.

feat: general backend

3bfeb79

Signed-off-by: xhe <xw897002528@gmail.com>

xhebox merged commit ce1f576 into MegEngine:main Jul 4, 2023

xhebox deleted the dd branch July 4, 2023 03:05

xhebox mentioned this pull request Jul 21, 2023

unsupported relocation 37 on musl libc #58

Closed
