* [PATCH 0/9] Hexagon (target/hexagon) New architecture support
@ 2023-04-26  2:30 Taylor Simpson
  2023-04-26  2:30 ` [PATCH 1/9] Hexagon (target/hexagon) Add support for v68/v69/v71/v73 Taylor Simpson
                   ` (8 more replies)
  0 siblings, 9 replies; 20+ messages in thread
From: Taylor Simpson @ 2023-04-26  2:30 UTC
  To: qemu-devel
  Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
	quic_mathbern

Add support for new Hexagon architecture versions v68/v69/v71/v73


Taylor Simpson (9):
  Hexagon (target/hexagon) Add support for v68/v69/v71/v73
  Hexagon (target/hexagon) Add v68 scalar instructions
  Hexagon (tests/tcg/hexagon) Add v68 scalar tests
  Hexagon (target/hexagon) Add v68 HVX instructions
  Hexagon (tests/tcg/hexagon) Add v68 HVX tests
  Hexagon (target/hexagon) Add v69 HVX instructions
  Hexagon (tests/tcg/hexagon) Add v69 HVX tests
  Hexagon (target/hexagon) Add v73 scalar instructions
  Hexagon (tests/tcg/hexagon) Add v73 scalar tests

 configure                                    |   2 +-
 linux-user/hexagon/target_elf.h              |  13 +-
 target/hexagon/cpu.h                         |   4 +
 target/hexagon/gen_tcg.h                     |  22 ++
 target/hexagon/gen_tcg_hvx.h                 |  12 +
 target/hexagon/mmvec/macros.h                |   9 +-
 tests/tcg/hexagon/v6mpy_ref.h                | 161 ++++++++++
 target/hexagon/attribs_def.h.inc             |  16 +
 target/hexagon/cpu.c                         |  20 ++
 target/hexagon/translate.c                   |   3 +
 tests/tcg/hexagon/misc.c                     |  12 +
 tests/tcg/hexagon/v68_hvx.c                  |  90 ++++++
 tests/tcg/hexagon/v68_scalar.c               | 186 +++++++++++
 tests/tcg/hexagon/v69_hvx.c                  | 318 ++++++++++++++++++
 tests/tcg/hexagon/v73_scalar.c               |  96 ++++++
 target/hexagon/gen_idef_parser_funcs.py      |   2 +
 target/hexagon/imported/branch.idef          |   7 +-
 target/hexagon/imported/encode_pp.def        |  21 +-
 target/hexagon/imported/ldst.idef            |  20 +-
 target/hexagon/imported/mmvec/encode_ext.def |  16 +-
 target/hexagon/imported/mmvec/ext.idef       | 321 ++++++++++++++++++-
 tests/tcg/hexagon/Makefile.target            |  13 +
 22 files changed, 1349 insertions(+), 15 deletions(-)
 create mode 100644 tests/tcg/hexagon/v6mpy_ref.h
 create mode 100644 tests/tcg/hexagon/v68_hvx.c
 create mode 100644 tests/tcg/hexagon/v68_scalar.c
 create mode 100644 tests/tcg/hexagon/v69_hvx.c
 create mode 100644 tests/tcg/hexagon/v73_scalar.c

-- 
2.25.1



* [PATCH 1/9] Hexagon (target/hexagon) Add support for v68/v69/v71/v73
  2023-04-26  2:30 [PATCH 0/9] Hexagon (target/hexagon) New architecture support Taylor Simpson
@ 2023-04-26  2:30 ` Taylor Simpson
  2023-04-26 18:06   ` Anton Johansson via
  2023-04-26  2:30 ` [PATCH 2/9] Hexagon (target/hexagon) Add v68 scalar instructions Taylor Simpson
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Taylor Simpson @ 2023-04-26  2:30 UTC
  To: qemu-devel
  Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
	quic_mathbern

Add support for the new ELF flags.
Move target/hexagon/cpu.[ch] to v73.
Change the compiler flag used by "make check-tcg" to -mv73.

The decbin instruction is removed in Hexagon v73, so check the
version before trying to compile the instruction.
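
The version check described above is a preprocessor guard. A minimal
sketch, assuming the compiler defines __HEXAGON_ARCH__ (which the misc.c
change in this patch relies on). The fallback definition and the helpers
arch_has_cabac() and run_cabac_tests() are hypothetical and exist only so
the sketch builds on a non-Hexagon host:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical fallback so the sketch also builds off-target. */
#ifndef __HEXAGON_ARCH__
#define __HEXAGON_ARCH__ 73
#endif

/* Same condition this patch adds to tests/tcg/hexagon/misc.c. */
#define CORE_HAS_CABAC (__HEXAGON_ARCH__ <= 71)

/* Runtime mirror of the same rule (hypothetical helper). */
static int arch_has_cabac(int arch)
{
    assert(arch > 0);   /* architecture numbers are positive: 67, 71, 73 */
    return arch <= 71;
}

/* Returns 1 if the decbin tests would be compiled in, 0 if skipped. */
static int run_cabac_tests(void)
{
#if CORE_HAS_CABAC
    /* On v71 and older the decbin inline asm would be exercised here. */
    return 1;
#else
    puts("Skipping cabac tests");
    return 0;
#endif
}
```

Guarding at compile time matters because the assembler rejects decbin
when targeting v73, so a runtime check alone would not let the file build.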

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 configure                         |  2 +-
 linux-user/hexagon/target_elf.h   | 13 +++++++++----
 target/hexagon/cpu.h              |  4 ++++
 target/hexagon/cpu.c              | 20 ++++++++++++++++++++
 tests/tcg/hexagon/misc.c          | 12 ++++++++++++
 tests/tcg/hexagon/Makefile.target |  3 +++
 6 files changed, 49 insertions(+), 5 deletions(-)

diff --git a/configure b/configure
index 77c03315f8..01fa77f6c7 100755
--- a/configure
+++ b/configure
@@ -1857,7 +1857,7 @@ fi
 : ${cross_cc_armeb="$cross_cc_arm"}
 : ${cross_cc_cflags_armeb="-mbig-endian"}
 : ${cross_cc_hexagon="hexagon-unknown-linux-musl-clang"}
-: ${cross_cc_cflags_hexagon="-mv67 -O2 -static"}
+: ${cross_cc_cflags_hexagon="-mv73 -O2 -static"}
 : ${cross_cc_cflags_i386="-m32"}
 : ${cross_cc_cflags_ppc="-m32 -mbig-endian"}
 : ${cross_cc_cflags_ppc64="-m64 -mbig-endian"}
diff --git a/linux-user/hexagon/target_elf.h b/linux-user/hexagon/target_elf.h
index b4e9f40527..a0271a0a2a 100644
--- a/linux-user/hexagon/target_elf.h
+++ b/linux-user/hexagon/target_elf.h
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *  Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -20,7 +20,7 @@
 
 static inline const char *cpu_get_model(uint32_t eflags)
 {
-    /* For now, treat anything newer than v5 as a v67 */
+    /* For now, treat anything newer than v5 as a v73 */
     /* FIXME - Disable instructions that are newer than the specified arch */
     if (eflags == 0x04 ||    /* v5  */
         eflags == 0x05 ||    /* v55 */
@@ -30,9 +30,14 @@ static inline const char *cpu_get_model(uint32_t eflags)
         eflags == 0x65 ||    /* v65 */
         eflags == 0x66 ||    /* v66 */
         eflags == 0x67 ||    /* v67 */
-        eflags == 0x8067     /* v67t */
+        eflags == 0x8067 ||  /* v67t */
+        eflags == 0x68 ||    /* v68 */
+        eflags == 0x69 ||    /* v69 */
+        eflags == 0x71 ||    /* v71 */
+        eflags == 0x8071 ||  /* v71t */
+        eflags == 0x73       /* v73 */
        ) {
-        return "v67";
+        return "v73";
     }
     return "unknown";
 }
diff --git a/target/hexagon/cpu.h b/target/hexagon/cpu.h
index 81b663ecfb..4d8981d862 100644
--- a/target/hexagon/cpu.h
+++ b/target/hexagon/cpu.h
@@ -43,6 +43,10 @@
 #define CPU_RESOLVING_TYPE TYPE_HEXAGON_CPU
 
 #define TYPE_HEXAGON_CPU_V67 HEXAGON_CPU_TYPE_NAME("v67")
+#define TYPE_HEXAGON_CPU_V68 HEXAGON_CPU_TYPE_NAME("v68")
+#define TYPE_HEXAGON_CPU_V69 HEXAGON_CPU_TYPE_NAME("v69")
+#define TYPE_HEXAGON_CPU_V71 HEXAGON_CPU_TYPE_NAME("v71")
+#define TYPE_HEXAGON_CPU_V73 HEXAGON_CPU_TYPE_NAME("v73")
 
 #define MMU_USER_IDX 0
 
diff --git a/target/hexagon/cpu.c b/target/hexagon/cpu.c
index ab40cfc283..8699db8c24 100644
--- a/target/hexagon/cpu.c
+++ b/target/hexagon/cpu.c
@@ -29,6 +29,22 @@ static void hexagon_v67_cpu_init(Object *obj)
 {
 }
 
+static void hexagon_v68_cpu_init(Object *obj)
+{
+}
+
+static void hexagon_v69_cpu_init(Object *obj)
+{
+}
+
+static void hexagon_v71_cpu_init(Object *obj)
+{
+}
+
+static void hexagon_v73_cpu_init(Object *obj)
+{
+}
+
 static ObjectClass *hexagon_cpu_class_by_name(const char *cpu_model)
 {
     ObjectClass *oc;
@@ -382,6 +398,10 @@ static const TypeInfo hexagon_cpu_type_infos[] = {
         .class_init = hexagon_cpu_class_init,
     },
     DEFINE_CPU(TYPE_HEXAGON_CPU_V67,              hexagon_v67_cpu_init),
+    DEFINE_CPU(TYPE_HEXAGON_CPU_V68,              hexagon_v68_cpu_init),
+    DEFINE_CPU(TYPE_HEXAGON_CPU_V69,              hexagon_v69_cpu_init),
+    DEFINE_CPU(TYPE_HEXAGON_CPU_V71,              hexagon_v71_cpu_init),
+    DEFINE_CPU(TYPE_HEXAGON_CPU_V73,              hexagon_v73_cpu_init),
 };
 
 DEFINE_TYPES(hexagon_cpu_type_infos)
diff --git a/tests/tcg/hexagon/misc.c b/tests/tcg/hexagon/misc.c
index e126751e3a..4fcbb22795 100644
--- a/tests/tcg/hexagon/misc.c
+++ b/tests/tcg/hexagon/misc.c
@@ -18,6 +18,8 @@
 #include <stdio.h>
 #include <string.h>
 
+#define CORE_HAS_CABAC            (__HEXAGON_ARCH__ <= 71)
+
 typedef unsigned char uint8_t;
 typedef unsigned short uint16_t;
 typedef unsigned int uint32_t;
@@ -245,6 +247,7 @@ static void check(int val, int expect)
     }
 }
 
+#if CORE_HAS_CABAC
 static void check64(long long val, long long expect)
 {
     if (val != expect) {
@@ -252,6 +255,7 @@ static void check64(long long val, long long expect)
         err++;
     }
 }
+#endif
 
 uint32_t init[10] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
 uint32_t array[10];
@@ -286,6 +290,7 @@ static long long creg_pair(int x, int y)
     return retval;
 }
 
+#if CORE_HAS_CABAC
 static long long decbin(long long x, long long y, int *pred)
 {
     long long retval;
@@ -295,6 +300,7 @@ static long long decbin(long long x, long long y, int *pred)
          : "r"(x), "r"(y));
     return retval;
 }
+#endif
 
 /* Check that predicates are auto-and'ed in a packet */
 static int auto_and(void)
@@ -388,8 +394,10 @@ void test_count_trailing_zeros_ones(void)
 int main()
 {
     int res;
+#if CORE_HAS_CABAC
     long long res64;
     int pred;
+#endif
 
     memcpy(array, init, sizeof(array));
     S4_storerhnew_rr(array, 4, 0xffff);
@@ -505,6 +513,7 @@ int main()
     res = test_clrtnew(2, 7);
     check(res, 7);
 
+#if CORE_HAS_CABAC
     res64 = decbin(0xf0f1f2f3f4f5f6f7LL, 0x7f6f5f4f3f2f1f0fLL, &pred);
     check64(res64, 0x357980003700010cLL);
     check(pred, 0);
@@ -512,6 +521,9 @@ int main()
     res64 = decbin(0xfLL, 0x1bLL, &pred);
     check64(res64, 0x78000100LL);
     check(pred, 1);
+#else
+    puts("Skipping cabac tests");
+#endif
 
     res = auto_and();
     check(res, 0);
diff --git a/tests/tcg/hexagon/Makefile.target b/tests/tcg/hexagon/Makefile.target
index 7c94db4bc4..59b1b074e9 100644
--- a/tests/tcg/hexagon/Makefile.target
+++ b/tests/tcg/hexagon/Makefile.target
@@ -82,6 +82,9 @@ TESTS += $(HEX_TESTS)
 usr: usr.c
 	$(CC) $(CFLAGS) -mv67t -O2 -Wno-inline-asm -Wno-expansion-to-defined $< -o $@ $(LDFLAGS)
 
+# Build this test with -mv71 to exercise the CABAC instruction
+misc: misc.c
+	$(CC) $(CFLAGS) -mv71 -O2 $< -o $@ $(LDFLAGS)
 scatter_gather: CFLAGS += -mhvx
 vector_add_int: CFLAGS += -mhvx -fvectorize
 hvx_misc: hvx_misc.c hvx_misc.h
-- 
2.25.1



* [PATCH 2/9] Hexagon (target/hexagon) Add v68 scalar instructions
  2023-04-26  2:30 [PATCH 0/9] Hexagon (target/hexagon) New architecture support Taylor Simpson
  2023-04-26  2:30 ` [PATCH 1/9] Hexagon (target/hexagon) Add support for v68/v69/v71/v73 Taylor Simpson
@ 2023-04-26  2:30 ` Taylor Simpson
  2023-04-27 10:36   ` Anton Johansson via
  2023-04-26  2:30 ` [PATCH 3/9] Hexagon (tests/tcg/hexagon) Add v68 scalar tests Taylor Simpson
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Taylor Simpson @ 2023-04-26  2:30 UTC
  To: qemu-devel
  Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
	quic_mathbern

The following instructions are added:
    L2_loadw_aq
    L4_loadd_aq
    R6_release_at_vi
    R6_release_st_vi
    S2_storew_rl_at_vi
    S4_stored_rl_at_vi
    S2_storew_rl_st_vi
    S4_stored_rl_st_vi

The release instructions are nops in QEMU.  The others behave as
loads/stores.

The encodings of these existing instructions now fix some bits that
were previously "don't care":
    L2_loadw_locked
    L4_loadd_locked
    S2_storew_locked
    S4_stored_locked
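
To illustrate why fixing those bits matters, here is a rough sketch of
matching an instruction halfword against the low 16 bits of the encoding
strings from encode_pp.def. It assumes '0'/'1' are fixed bits and that
every other character ('-', 'P', operand letters) is ignored for
matching. The BitPattern type, parse_pattern() and matches() are
hypothetical names, not QEMU's decoder:

```c
#include <assert.h>
#include <stdint.h>

/* Fixed-bit mask and expected value derived from an encoding string. */
typedef struct {
    uint32_t mask;
    uint32_t value;
} BitPattern;

/* Parse MSB-first; spaces are separators, only '0'/'1' constrain a bit. */
static BitPattern parse_pattern(const char *s)
{
    BitPattern p = { 0, 0 };
    int nbits = 0;

    for (; *s; s++) {
        if (*s == ' ') {
            continue;
        }
        p.mask <<= 1;
        p.value <<= 1;
        if (*s == '0' || *s == '1') {
            p.mask |= 1;
            p.value |= (uint32_t)(*s - '0');
        }
        nbits++;
    }
    assert(nbits <= 32);    /* patterns describe at most one 32-bit word */
    return p;
}

/* True if every fixed bit of the pattern agrees with 'bits'. */
static int matches(const char *pattern, uint32_t bits)
{
    BitPattern p = parse_pattern(pattern);
    return (bits & p.mask) == p.value;
}
```

With the halfword 0x0803 (bit 11 set, destination register 3), the old
L2_loadw_locked string "PP00---- -00ddddd" matches, while the new
"PP000--- 000ddddd" does not and the L2_loadw_aq string
"PP001--- 000ddddd" does. So the old don't-care bits would have
overlapped with the new load-acquire encoding.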

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/gen_tcg.h                | 18 ++++++++++++++++++
 target/hexagon/attribs_def.h.inc        |  7 +++++++
 target/hexagon/translate.c              |  3 +++
 target/hexagon/gen_idef_parser_funcs.py |  2 ++
 target/hexagon/imported/encode_pp.def   | 19 ++++++++++++++-----
 target/hexagon/imported/ldst.idef       | 20 +++++++++++++++++++-
 6 files changed, 63 insertions(+), 6 deletions(-)

diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index 329e7a1024..598d80d3ce 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -1236,6 +1236,24 @@
         uiV = uiV; \
     } while (0)
 
+#define fGEN_TCG_L2_loadw_aq(SHORTCODE)                 SHORTCODE
+#define fGEN_TCG_L4_loadd_aq(SHORTCODE)                 SHORTCODE
+
+/* Nothing to do for these in qemu, need to suppress compiler warnings */
+#define fGEN_TCG_R6_release_at_vi(SHORTCODE) \
+    do { \
+        RsV = RsV; \
+    } while (0)
+#define fGEN_TCG_R6_release_st_vi(SHORTCODE) \
+    do { \
+        RsV = RsV; \
+    } while (0)
+
+#define fGEN_TCG_S2_storew_rl_at_vi(SHORTCODE)          SHORTCODE
+#define fGEN_TCG_S4_stored_rl_at_vi(SHORTCODE)          SHORTCODE
+#define fGEN_TCG_S2_storew_rl_st_vi(SHORTCODE)          SHORTCODE
+#define fGEN_TCG_S4_stored_rl_st_vi(SHORTCODE)          SHORTCODE
+
 #define fGEN_TCG_J2_trap0(SHORTCODE) \
     do { \
         uiV = uiV; \
diff --git a/target/hexagon/attribs_def.h.inc b/target/hexagon/attribs_def.h.inc
index 9874d1658f..0ddfb45bdf 100644
--- a/target/hexagon/attribs_def.h.inc
+++ b/target/hexagon/attribs_def.h.inc
@@ -52,6 +52,12 @@ DEF_ATTRIB(REGWRSIZE_4B, "Memory width is 4 bytes", "", "")
 DEF_ATTRIB(REGWRSIZE_8B, "Memory width is 8 bytes", "", "")
 DEF_ATTRIB(MEMLIKE, "Memory-like instruction", "", "")
 DEF_ATTRIB(MEMLIKE_PACKET_RULES, "follows Memory-like packet rules", "", "")
+DEF_ATTRIB(RELEASE, "Releases a lock", "", "")
+DEF_ATTRIB(ACQUIRE, "Acquires a lock", "", "")
+
+DEF_ATTRIB(RLS_INNER, "Store release inner visibility", "", "")
+DEF_ATTRIB(RLS_ALL_THREAD, "Store release among all threads", "", "")
+DEF_ATTRIB(RLS_SAME_THREAD, "Store release with the same thread", "", "")
 
 /* V6 Vector attributes */
 DEF_ATTRIB(CVI, "Executes on the HVX extension", "", "")
@@ -74,6 +80,7 @@ DEF_ATTRIB(CVI_SCATTER_RELEASE, "CVI Store Release for scatter", "", "")
 DEF_ATTRIB(CVI_TMP_DST, "CVI instruction that doesn't write a register", "", "")
 DEF_ATTRIB(CVI_SLOT23, "Can execute in slot 2 or slot 3 (HVX)", "", "")
 
+DEF_ATTRIB(VTCM_ALLBANK_ACCESS, "Allocates in all VTCM schedulers.", "", "")
 
 /* Change-of-flow attributes */
 DEF_ATTRIB(JUMP, "Jump-type instruction", "", "")
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index c087f183d0..5308d05447 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -481,6 +481,9 @@ static void mark_store_width(DisasContext *ctx)
     uint8_t width = 0;
 
     if (GET_ATTRIB(opcode, A_SCALAR_STORE)) {
+        if (GET_ATTRIB(opcode, A_MEMSIZE_0B)) {
+            return;
+        }
         if (GET_ATTRIB(opcode, A_MEMSIZE_1B)) {
             width |= 1;
         }
diff --git a/target/hexagon/gen_idef_parser_funcs.py b/target/hexagon/gen_idef_parser_funcs.py
index afe68bdb6f..dc9e396b52 100644
--- a/target/hexagon/gen_idef_parser_funcs.py
+++ b/target/hexagon/gen_idef_parser_funcs.py
@@ -109,6 +109,8 @@ def main():
                 continue
             if "A_COF" in hex_common.attribdict[tag]:
                 continue
+            if ( tag.startswith('R6_release_') ):
+                continue
 
             regs = tagregs[tag]
             imms = tagimms[tag]
diff --git a/target/hexagon/imported/encode_pp.def b/target/hexagon/imported/encode_pp.def
index d71c04cd30..763f465bfd 100644
--- a/target/hexagon/imported/encode_pp.def
+++ b/target/hexagon/imported/encode_pp.def
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *  Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -382,14 +382,23 @@ DEF_ENC32(L4_return_fnew_pt,  ICLASS_LD" 011 0 000 sssss PP1110vv ---ddddd")
 DEF_ENC32(L4_return_tnew_pnt, ICLASS_LD" 011 0 000 sssss PP0010vv ---ddddd")
 DEF_ENC32(L4_return_fnew_pnt, ICLASS_LD" 011 0 000 sssss PP1010vv ---ddddd")
 
-DEF_ENC32(L2_loadw_locked,ICLASS_LD" 001 0 000 sssss PP00---- -00ddddd")
+DEF_ENC32(L2_loadw_locked,ICLASS_LD" 001 0 000 sssss PP000--- 000ddddd")
 
 
 
+DEF_ENC32(L2_loadw_aq,        ICLASS_LD" 001 0 000 sssss PP001--- 000ddddd")
+DEF_ENC32(L4_loadd_aq,        ICLASS_LD" 001 0 000 sssss PP011--- 000ddddd")
 
+DEF_ENC32(R6_release_at_vi,    ICLASS_ST" 000 01 11sssss PP0ttttt --0011dd")
+DEF_ENC32(R6_release_st_vi,   ICLASS_ST" 000 01 11sssss PP0ttttt --1011dd")
 
+DEF_ENC32(S2_storew_rl_at_vi,  ICLASS_ST" 000 01 01sssss PP-ttttt --0010dd")
+DEF_ENC32(S2_storew_rl_st_vi, ICLASS_ST" 000 01 01sssss PP-ttttt --1010dd")
 
-DEF_ENC32(L4_loadd_locked,ICLASS_LD" 001 0 000 sssss PP01---- -00ddddd")
+DEF_ENC32(S4_stored_rl_at_vi,  ICLASS_ST" 000 01 11sssss PP0ttttt --0010dd")
+DEF_ENC32(S4_stored_rl_st_vi, ICLASS_ST" 000 01 11sssss PP0ttttt --1010dd")
+
+DEF_ENC32(L4_loadd_locked,ICLASS_LD" 001 0 000 sssss PP010--- 000ddddd")
 DEF_EXT_SPACE(EXTRACTW,   ICLASS_LD" 001 0 000 iiiii PP0iiiii -01iiiii")
 DEF_ENC32(Y2_dcfetchbo,   ICLASS_LD" 010 0 000 sssss PP0--iii iiiiiiii")
 
@@ -479,8 +488,8 @@ STD_PST_ENC(rinew, "1 101","10ttt")
 /*                               x bus/cache     */
 /*                                    x store/cache     */
 DEF_ENC32(S2_allocframe,   ICLASS_ST" 000 01 00xxxxx PP000iii iiiiiiii")
-DEF_ENC32(S2_storew_locked,ICLASS_ST" 000 01 01sssss PP-ttttt ------dd")
-DEF_ENC32(S4_stored_locked,ICLASS_ST" 000 01 11sssss PP0ttttt ------dd")
+DEF_ENC32(S2_storew_locked,ICLASS_ST" 000 01 01sssss PP-ttttt ----00dd")
+DEF_ENC32(S4_stored_locked,ICLASS_ST" 000 01 11sssss PP0ttttt ----00dd")
 DEF_ENC32(Y2_dczeroa,      ICLASS_ST" 000 01 10sssss PP0----- --------")
 
 
diff --git a/target/hexagon/imported/ldst.idef b/target/hexagon/imported/ldst.idef
index 237634bdd9..53198176a9 100644
--- a/target/hexagon/imported/ldst.idef
+++ b/target/hexagon/imported/ldst.idef
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *  Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -128,6 +128,24 @@ Q6INSN(S2_allocframe,"allocframe(Rx32,#u11:3):raw", ATTRIBS(A_REGWRSIZE_8B,A_MEM
 
 #define A_RETURN A_RESTRICT_COF_MAX1,A_RESTRICT_SLOT0ONLY,A_RESTRICT_NOSLOT1_STORE,A_RET_TYPE,A_DEALLOCRET
 
+/**** Load Acquire Store Release Instructions****/
+
+
+
+Q6INSN(L2_loadw_aq,"Rd32=memw_aq(Rs32)",ATTRIBS(A_REGWRSIZE_4B,A_ACQUIRE,A_RESTRICT_SLOT0ONLY,A_MEMSIZE_4B,A_LOAD),"Load Acquire Word",
+{ fEA_REG(RsV); fLOAD(1,4,u,EA,RdV); })
+Q6INSN(L4_loadd_aq,"Rdd32=memd_aq(Rs32)",ATTRIBS(A_REGWRSIZE_8B,A_ACQUIRE,A_RESTRICT_SLOT0ONLY,A_MEMSIZE_8B,A_LOAD),"Load Acquire Double integer",
+{ fEA_REG(RsV); fLOAD(1,8,u,EA,RddV); })
+
+Q6INSN(R6_release_at_vi,"release(Rs32):at",ATTRIBS(A_MEMSIZE_0B,A_RELEASE,A_STORE,A_VTCM_ALLBANK_ACCESS,A_RLS_INNER,A_RLS_ALL_THREAD,A_RESTRICT_NOPACKET,A_RESTRICT_SLOT0ONLY), "Release lock", {fEA_REG(RsV); fSTORE(1,0,EA,RsV); })
+Q6INSN(R6_release_st_vi,"release(Rs32):st",ATTRIBS(A_MEMSIZE_0B,A_RELEASE,A_STORE,A_VTCM_ALLBANK_ACCESS,A_RLS_INNER,A_RLS_SAME_THREAD,A_RESTRICT_NOPACKET,A_RESTRICT_SLOT0ONLY), "Release lock", {fEA_REG(RsV); fSTORE(1,0,EA,RsV); })
+
+Q6INSN(S2_storew_rl_at_vi,"memw_rl(Rs32):at=Rt32",ATTRIBS(A_REGWRSIZE_4B,A_RELEASE,A_VTCM_ALLBANK_ACCESS,A_RLS_INNER,A_RLS_ALL_THREAD,A_RESTRICT_NOPACKET,A_MEMSIZE_4B,A_STORE,A_RESTRICT_SLOT0ONLY),"Store Release Word", { fEA_REG(RsV); fSTORE(1,4,EA,RtV); })
+Q6INSN(S4_stored_rl_at_vi,"memd_rl(Rs32):at=Rtt32",ATTRIBS(A_REGWRSIZE_8B,A_RELEASE,A_VTCM_ALLBANK_ACCESS,A_RLS_INNER,A_RLS_ALL_THREAD,A_RESTRICT_NOPACKET,A_MEMSIZE_8B,A_STORE,A_RESTRICT_SLOT0ONLY),"Store Release Double integer", { fEA_REG(RsV); fSTORE(1,8,EA,RttV); })
+
+Q6INSN(S2_storew_rl_st_vi,"memw_rl(Rs32):st=Rt32",ATTRIBS(A_REGWRSIZE_4B,A_RELEASE,A_VTCM_ALLBANK_ACCESS,A_RLS_INNER,A_RLS_SAME_THREAD,A_RESTRICT_NOPACKET,A_MEMSIZE_4B,A_STORE,A_RESTRICT_SLOT0ONLY),"Store Release Word", { fEA_REG(RsV); fSTORE(1,4,EA,RtV); })
+Q6INSN(S4_stored_rl_st_vi,"memd_rl(Rs32):st=Rtt32",ATTRIBS(A_REGWRSIZE_8B,A_RELEASE,A_VTCM_ALLBANK_ACCESS,A_RLS_INNER,A_RLS_SAME_THREAD,A_RESTRICT_NOPACKET,A_MEMSIZE_8B,A_STORE,A_RESTRICT_SLOT0ONLY),"Store Release Double integer", { fEA_REG(RsV); fSTORE(1,8,EA,RttV); })
+
 Q6INSN(L2_deallocframe,"Rdd32=deallocframe(Rs32):raw", ATTRIBS(A_REGWRSIZE_8B,A_MEMSIZE_8B,A_LOAD,A_DEALLOCFRAME), "Deallocate stack frame",
 { fHIDE(size8u_t tmp;) fEA_REG(RsV);
   fLOAD(1,8,u,EA,tmp);
-- 
2.25.1



* [PATCH 3/9] Hexagon (tests/tcg/hexagon) Add v68 scalar tests
  2023-04-26  2:30 [PATCH 0/9] Hexagon (target/hexagon) New architecture support Taylor Simpson
  2023-04-26  2:30 ` [PATCH 1/9] Hexagon (target/hexagon) Add support for v68/v69/v71/v73 Taylor Simpson
  2023-04-26  2:30 ` [PATCH 2/9] Hexagon (target/hexagon) Add v68 scalar instructions Taylor Simpson
@ 2023-04-26  2:30 ` Taylor Simpson
  2023-04-27 13:34   ` Anton Johansson via
  2023-04-26  2:30 ` [PATCH 4/9] Hexagon (target/hexagon) Add v68 HVX instructions Taylor Simpson
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Taylor Simpson @ 2023-04-26  2:30 UTC
  To: qemu-devel
  Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
	quic_mathbern

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 tests/tcg/hexagon/v68_scalar.c    | 186 ++++++++++++++++++++++++++++++
 tests/tcg/hexagon/Makefile.target |   2 +
 2 files changed, 188 insertions(+)
 create mode 100644 tests/tcg/hexagon/v68_scalar.c

diff --git a/tests/tcg/hexagon/v68_scalar.c b/tests/tcg/hexagon/v68_scalar.c
new file mode 100644
index 0000000000..7a8adb1130
--- /dev/null
+++ b/tests/tcg/hexagon/v68_scalar.c
@@ -0,0 +1,186 @@
+/*
+ *  Copyright(c) 2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <stdio.h>
+#include <stdbool.h>
+#include <stdint.h>
+
+/*
+ *  Test the scalar core instructions that are new in v68
+ */
+
+int err;
+
+static int buffer32[] = { 1, 2, 3, 4 };
+static long long buffer64[] = { 5, 6, 7, 8 };
+
+static void __check32(int line, uint32_t result, uint32_t expect)
+{
+    if (result != expect) {
+        printf("ERROR at line %d: 0x%08x != 0x%08x\n",
+               line, result, expect);
+        err++;
+    }
+}
+
+#define check32(RES, EXP) __check32(__LINE__, RES, EXP)
+
+static void __check64(int line, uint64_t result, uint64_t expect)
+{
+    if (result != expect) {
+        printf("ERROR at line %d: 0x%016llx != 0x%016llx\n",
+               line, result, expect);
+        err++;
+    }
+}
+
+#define check64(RES, EXP) __check64(__LINE__, RES, EXP)
+
+static inline int loadw_aq(int *p)
+{
+    int res;
+    asm volatile("%0 = memw_aq(%1)\n\t"
+                 : "=r"(res) : "r"(p));
+    return res;
+}
+
+static void test_loadw_aq(void)
+{
+    int res;
+
+    res = loadw_aq(&buffer32[0]);
+    check32(res, 1);
+    res = loadw_aq(&buffer32[1]);
+    check32(res, 2);
+}
+
+static inline long long loadd_aq(long long *p)
+{
+    long long res;
+    asm volatile("%0 = memd_aq(%1)\n\t"
+                 : "=r"(res) : "r"(p));
+    return res;
+}
+
+static void test_loadd_aq(void)
+{
+    long long res;
+
+    res = loadd_aq(&buffer64[2]);
+    check64(res, 7);
+    res = loadd_aq(&buffer64[3]);
+    check64(res, 8);
+}
+
+static inline void release_at(int *p)
+{
+    asm volatile("release(%0):at\n\t"
+                 : : "r"(p));
+}
+
+static void test_release_at(void)
+{
+    release_at(&buffer32[2]);
+    check32(buffer32[2], 3);
+    release_at(&buffer32[3]);
+    check32(buffer32[3], 4);
+}
+
+static inline void release_st(int *p)
+{
+    asm volatile("release(%0):st\n\t"
+                 : : "r"(p));
+}
+
+static void test_release_st(void)
+{
+    release_st(&buffer32[2]);
+    check32(buffer32[2], 3);
+    release_st(&buffer32[3]);
+    check32(buffer32[3], 4);
+}
+
+static inline void storew_rl_at(int *p, int val)
+{
+    asm volatile("memw_rl(%0):at = %1\n\t"
+                 : : "r"(p), "r"(val) : "memory");
+}
+
+static void test_storew_rl_at(void)
+{
+    storew_rl_at(&buffer32[2], 9);
+    check32(buffer32[2], 9);
+    storew_rl_at(&buffer32[3], 10);
+    check32(buffer32[3], 10);
+}
+
+static inline void stored_rl_at(long long *p, long long val)
+{
+    asm volatile("memd_rl(%0):at = %1\n\t"
+                 : : "r"(p), "r"(val) : "memory");
+}
+
+static void test_stored_rl_at(void)
+{
+    stored_rl_at(&buffer64[2], 11);
+    check64(buffer64[2], 11);
+    stored_rl_at(&buffer64[3], 12);
+    check64(buffer64[3], 12);
+}
+
+static inline void storew_rl_st(int *p, int val)
+{
+    asm volatile("memw_rl(%0):st = %1\n\t"
+                 : : "r"(p), "r"(val) : "memory");
+}
+
+static void test_storew_rl_st(void)
+{
+    storew_rl_st(&buffer32[0], 13);
+    check32(buffer32[0], 13);
+    storew_rl_st(&buffer32[1], 14);
+    check32(buffer32[1], 14);
+}
+
+static inline void stored_rl_st(long long *p, long long val)
+{
+    asm volatile("memd_rl(%0):st = %1\n\t"
+                 : : "r"(p), "r"(val) : "memory");
+}
+
+static void test_stored_rl_st(void)
+{
+    stored_rl_st(&buffer64[0], 15);
+    check64(buffer64[0], 15);
+    stored_rl_st(&buffer64[1], 15);
+    check64(buffer64[1], 15);
+}
+
+int main()
+{
+    test_loadw_aq();
+    test_loadd_aq();
+    test_release_at();
+    test_release_st();
+    test_storew_rl_at();
+    test_stored_rl_at();
+    test_storew_rl_st();
+    test_stored_rl_st();
+
+    puts(err ? "FAIL" : "PASS");
+    return err ? 1 : 0;
+}
diff --git a/tests/tcg/hexagon/Makefile.target b/tests/tcg/hexagon/Makefile.target
index 59b1b074e9..b7529e23bc 100644
--- a/tests/tcg/hexagon/Makefile.target
+++ b/tests/tcg/hexagon/Makefile.target
@@ -76,6 +76,8 @@ HEX_TESTS += test_vminh
 HEX_TESTS += test_vpmpyh
 HEX_TESTS += test_vspliceb
 
+HEX_TESTS += v68_scalar
+
 TESTS += $(HEX_TESTS)
 
 # This test has to be compiled for the -mv67t target
-- 
2.25.1



* [PATCH 4/9] Hexagon (target/hexagon) Add v68 HVX instructions
  2023-04-26  2:30 [PATCH 0/9] Hexagon (target/hexagon) New architecture support Taylor Simpson
                   ` (2 preceding siblings ...)
  2023-04-26  2:30 ` [PATCH 3/9] Hexagon (tests/tcg/hexagon) Add v68 scalar tests Taylor Simpson
@ 2023-04-26  2:30 ` Taylor Simpson
  2023-04-27 13:36   ` Anton Johansson via
  2023-04-26  2:30 ` [PATCH 5/9] Hexagon (tests/tcg/hexagon) Add v68 HVX tests Taylor Simpson
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Taylor Simpson @ 2023-04-26  2:30 UTC
  To: qemu-devel
  Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
	quic_mathbern

The following instructions are added:
    V6_v6mpyvubs10_vxx
    V6_v6mpyhubs10_vxx
    V6_v6mpyvubs10
    V6_v6mpyhubs10
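
These instructions read 10-bit signed coefficients packed into each
32-bit word of Vvv, using the fGET10BIT macro this patch adds to
mmvec/macros.h: the low 8 bits come from byte POS, and the top 2 bits
(sign-extended) come from bits [24 + 2 * POS, 25 + 2 * POS]. A standalone
sketch of that extraction follows. The function name get10bit() and the
open-coded sign extension are stand-ins for the sextract32()/extract32()
calls the macro actually uses:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Extract the 10-bit signed coefficient number 'pos' (0..2) from 'val'.
 * Hypothetical helper mirroring the arithmetic of fGET10BIT.
 */
static int16_t get10bit(uint32_t val, int pos)
{
    assert(pos >= 0 && pos <= 2);

    /* Low 8 bits: byte 'pos' of the word, taken as unsigned. */
    int lo = (int)((val >> (pos * 8)) & 0xff);

    /* High 2 bits: a 2-bit field in the top byte, sign-extended. */
    int hi = (int)((val >> (24 + 2 * pos)) & 0x3);
    if (hi & 0x2) {
        hi -= 4;
    }

    /* Same result as (hi << 8) | lo, without shifting a negative value. */
    return (int16_t)(hi * 256 + lo);
}
```

For example, get10bit(0x030000ff, 0) yields -1 (high bits 11, low byte
0xff), and get10bit(0x20000000, 2) yields -512, the most negative
coefficient.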

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/mmvec/macros.h                |   9 +-
 target/hexagon/imported/mmvec/encode_ext.def |   8 +-
 target/hexagon/imported/mmvec/ext.idef       | 281 ++++++++++++++++++-
 3 files changed, 295 insertions(+), 3 deletions(-)

diff --git a/target/hexagon/mmvec/macros.h b/target/hexagon/mmvec/macros.h
index 1201d778d0..a655634fd1 100644
--- a/target/hexagon/mmvec/macros.h
+++ b/target/hexagon/mmvec/macros.h
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *  Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -346,4 +346,11 @@
 #define fUARCH_NOTE_PUMP_2X()
 
 #define IV1DEAD()
+
+#define fGET10BIT(COE, VAL, POS) \
+    do { \
+        COE = (sextract32(VAL, 24 + 2 * POS, 2) << 8) | \
+               extract32(VAL, POS * 8, 8); \
+    } while (0);
+
 #endif
diff --git a/target/hexagon/imported/mmvec/encode_ext.def b/target/hexagon/imported/mmvec/encode_ext.def
index 6fbbe2c422..b9b62fef8d 100644
--- a/target/hexagon/imported/mmvec/encode_ext.def
+++ b/target/hexagon/imported/mmvec/encode_ext.def
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *  Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -730,6 +730,8 @@ DEF_ENC(V6_vmaxb,         ICLASS_CJ" 1 111 001 vvvvv PP 0 uuuuu 101 ddddd") //
 DEF_ENC(V6_vsatuwuh,    ICLASS_CJ" 1 111 001 vvvvv PP 0 uuuuu 110 ddddd") //
 DEF_ENC(V6_vdealb4w,     ICLASS_CJ" 1 111 001 vvvvv PP 0 uuuuu 111 ddddd") //
 
+DEF_ENC(V6_v6mpyvubs10_vxx, 	ICLASS_CJ" 1 111 001 vvvvv PP 1 uuuuu 0ii xxxxx")
+DEF_ENC(V6_v6mpyhubs10_vxx, 	ICLASS_CJ" 1 111 001 vvvvv PP 1 uuuuu 1ii xxxxx")
 
 DEF_ENC(V6_vmpyowh_rnd,     ICLASS_CJ" 1 111 010 vvvvv PP 0 uuuuu 000 ddddd") //
 DEF_ENC(V6_vshuffeb,      ICLASS_CJ" 1 111 010 vvvvv PP 0 uuuuu 001 ddddd") //
@@ -740,6 +742,10 @@ DEF_ENC(V6_vshufoeh,      ICLASS_CJ" 1 111 010 vvvvv PP 0 uuuuu 101 ddddd") //
 DEF_ENC(V6_vshufoeb,      ICLASS_CJ" 1 111 010 vvvvv PP 0 uuuuu 110 ddddd") //
 DEF_ENC(V6_vcombine,     ICLASS_CJ" 1 111 010 vvvvv PP 0 uuuuu 111 ddddd") //
 
+DEF_ENC(V6_v6mpyvubs10,  ICLASS_CJ" 1 111 010 vvvvv PP 1 uuuuu 0ii ddddd")
+DEF_ENC(V6_v6mpyhubs10,  ICLASS_CJ" 1 111 010 vvvvv PP 1 uuuuu 1ii ddddd")
+
+
 DEF_ENC(V6_vmpyieoh,     ICLASS_CJ" 1 111 011 vvvvv PP 0 uuuuu 000 ddddd") //
 DEF_ENC(V6_vadduwsat,     ICLASS_CJ" 1 111 011 vvvvv PP 0 uuuuu 001 ddddd") //
 DEF_ENC(V6_vsathub,     ICLASS_CJ" 1 111 011 vvvvv PP 0 uuuuu 010 ddddd") //
diff --git a/target/hexagon/imported/mmvec/ext.idef b/target/hexagon/imported/mmvec/ext.idef
index 8ca5a606e1..c0d169fd4f 100644
--- a/target/hexagon/imported/mmvec/ext.idef
+++ b/target/hexagon/imported/mmvec/ext.idef
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *  Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -116,6 +116,10 @@ ITERATOR_INSN_MPY_SLOT_LATE(WIDTH,TAG, SYNTAX2,DESCR,CODE)
 EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VX_DV),  \
 DESCR, DO_FOR_EACH_CODE(WIDTH, CODE))
 
+#define ITERATOR_INSN_MPY_SLOT_DOUBLE_VEC_VX_FWD(WIDTH,TAG,SYNTAX,DESCR,CODE) \
+EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VX_DV),  \
+DESCR, DO_FOR_EACH_CODE(WIDTH, CODE))
+
 #define ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(WIDTH,TAG,SYNTAX,SYNTAX2,DESCR,CODE) \
 ITERATOR_INSN_MPY_SLOT_DOUBLE_VEC(WIDTH,TAG,SYNTAX2,DESCR,CODE)
 
@@ -2507,6 +2511,281 @@ EXTINSN(V6_vscattermhw , "vscatter(Rt32,Mu2,Vvv32.w).h=Vw32", ATTRIBS(A_EXTENSIO
 })
 
 
+ITERATOR_INSN_MPY_SLOT_DOUBLE_VEC_VX_FWD(32, v6mpyvubs10_vxx, "Vxx32.w+=v6mpy(Vuu32.ub,Vvv32.b,#u2):v", "",
+    fHIDE(size2s_t c00;)
+    fGET10BIT(c00, VvvV.v[0].uw[i], 0)
+    fHIDE(size2s_t c01;)
+    fGET10BIT(c01, VvvV.v[0].uw[i], 1)
+    fHIDE(size2s_t c02;)
+    fGET10BIT(c02, VvvV.v[0].uw[i], 2)
+
+	fHIDE(size2s_t c10;)
+    fGET10BIT(c10, VvvV.v[1].uw[i], 0)
+    fHIDE(size2s_t c11;)
+    fGET10BIT(c11, VvvV.v[1].uw[i], 1)
+    fHIDE(size2s_t c12;)
+    fGET10BIT(c12, VvvV.v[1].uw[i], 2)
+
+    if (uiV == 0) {
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(3,VuuV.v[0].uw[i]), c10);
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[1].uw[i]), c11);
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(3,VuuV.v[1].uw[i]), c12);
+
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[0].uw[i]), c00);
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[1].uw[i]), c01);
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[1].uw[i]), c02);
+
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[0].uw[i]), c10);
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[1].uw[i]), c11);
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[1].uw[i]), c12);
+
+    } else if (uiV == 1) {
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(3,VuuV.v[0].uw[i]), c00);
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[1].uw[i]), c01);
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(3,VuuV.v[1].uw[i]), c02);
+
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(3,VuuV.v[0].uw[i]), c10);
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[1].uw[i]), c11);
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(3,VuuV.v[1].uw[i]), c12);
+
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[0].uw[i]), c00);
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[1].uw[i]), c01);
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[1].uw[i]), c02);
+
+    } else if (uiV == 2) {
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[0].uw[i]), c10);
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(3,VuuV.v[0].uw[i]), c11);
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[1].uw[i]), c12);
+
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[0].uw[i]), c00);
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[0].uw[i]), c01);
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[1].uw[i]), c02);
+
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[0].uw[i]), c10);
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[0].uw[i]), c11);
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[1].uw[i]), c12);
+
+    } else if (uiV == 3) {
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[0].uw[i]), c00);
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(3,VuuV.v[0].uw[i]), c01);
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[1].uw[i]), c02);
+
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[0].uw[i]), c10);
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(3,VuuV.v[0].uw[i]), c11);
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[1].uw[i]), c12);
+
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[0].uw[i]), c00);
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[0].uw[i]), c01);
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[1].uw[i]), c02);
+    }
+)
+ITERATOR_INSN_MPY_SLOT_DOUBLE_VEC_VX_FWD(32, v6mpyhubs10_vxx, "Vxx32.w+=v6mpy(Vuu32.ub,Vvv32.b,#u2):h", "",
+    fHIDE(size2s_t c00;)
+    fGET10BIT(c00, VvvV.v[0].uw[i], 0)
+    fHIDE(size2s_t c01;)
+    fGET10BIT(c01, VvvV.v[0].uw[i], 1)
+    fHIDE(size2s_t c02;)
+    fGET10BIT(c02, VvvV.v[0].uw[i], 2)
+    fHIDE(size2s_t c10;)
+    fGET10BIT(c10, VvvV.v[1].uw[i], 0)
+    fHIDE(size2s_t c11;)
+    fGET10BIT(c11, VvvV.v[1].uw[i], 1)
+    fHIDE(size2s_t c12;)
+    fGET10BIT(c12, VvvV.v[1].uw[i], 2)
+
+    if (uiV == 0) {
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(3,VuuV.v[1].uw[i]), c10);
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[1].uw[i]), c11);
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(3,VuuV.v[0].uw[i]), c12);
+
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[1].uw[i]), c00);
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[1].uw[i]), c01);
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[0].uw[i]), c02);
+
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[1].uw[i]), c10);
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[1].uw[i]), c11);
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[0].uw[i]), c12);
+
+    } else if (uiV == 1) {
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(3,VuuV.v[1].uw[i]), c00);
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[1].uw[i]), c01);
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(3,VuuV.v[0].uw[i]), c02);
+
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(3,VuuV.v[1].uw[i]), c10);
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[1].uw[i]), c11);
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(3,VuuV.v[0].uw[i]), c12);
+
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[1].uw[i]), c00);
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[1].uw[i]), c01);
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[0].uw[i]), c02);
+
+    } else if (uiV == 2) {
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[1].uw[i]), c10);
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(3,VuuV.v[0].uw[i]), c11);
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[0].uw[i]), c12);
+
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[1].uw[i]), c00);
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[0].uw[i]), c01);
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[0].uw[i]), c02);
+
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[1].uw[i]), c10);
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[0].uw[i]), c11);
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[0].uw[i]), c12);
+
+    } else if (uiV == 3) {
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[1].uw[i]), c00);
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(3,VuuV.v[0].uw[i]), c01);
+        VxxV.v[1].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[0].uw[i]), c02);
+
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[1].uw[i]), c10);
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(3,VuuV.v[0].uw[i]), c11);
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[0].uw[i]), c12);
+
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[1].uw[i]), c00);
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[0].uw[i]), c01);
+        VxxV.v[0].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[0].uw[i]), c02);
+    }
+)
+
+
+ITERATOR_INSN_MPY_SLOT_DOUBLE_VEC(32, v6mpyvubs10, "Vdd32.w=v6mpy(Vuu32.ub,Vvv32.b,#u2):v", "",
+    fHIDE(short c00;)
+    fGET10BIT(c00, VvvV.v[0].uw[i], 0)
+    fHIDE(short c01;)
+    fGET10BIT(c01, VvvV.v[0].uw[i], 1)
+    fHIDE(short c02;)
+    fGET10BIT(c02, VvvV.v[0].uw[i], 2)
+    fHIDE(short c10;)
+    fGET10BIT(c10, VvvV.v[1].uw[i], 0)
+    fHIDE(short c11;)
+    fGET10BIT(c11, VvvV.v[1].uw[i], 1)
+    fHIDE(short c12;)
+    fGET10BIT(c12, VvvV.v[1].uw[i], 2)
+
+
+
+    if (uiV == 0) {
+        VddV.v[1].w[i]  = fMPY16US(fGETUBYTE(3,VuuV.v[0].uw[i]), c10);
+        VddV.v[1].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[1].uw[i]), c11);
+        VddV.v[1].w[i] += fMPY16US(fGETUBYTE(3,VuuV.v[1].uw[i]), c12);
+
+        VddV.v[1].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[0].uw[i]), c00);
+        VddV.v[1].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[1].uw[i]), c01);
+        VddV.v[1].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[1].uw[i]), c02);
+
+        VddV.v[0].w[i]  = fMPY16US(fGETUBYTE(1,VuuV.v[0].uw[i]), c10);
+        VddV.v[0].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[1].uw[i]), c11);
+        VddV.v[0].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[1].uw[i]), c12);
+
+    } else if (uiV == 1) {
+        VddV.v[1].w[i]  = fMPY16US(fGETUBYTE(3,VuuV.v[0].uw[i]), c00);
+        VddV.v[1].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[1].uw[i]), c01);
+        VddV.v[1].w[i] += fMPY16US(fGETUBYTE(3,VuuV.v[1].uw[i]), c02);
+
+        VddV.v[0].w[i]  = fMPY16US(fGETUBYTE(3,VuuV.v[0].uw[i]), c10);
+        VddV.v[0].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[1].uw[i]), c11);
+        VddV.v[0].w[i] += fMPY16US(fGETUBYTE(3,VuuV.v[1].uw[i]), c12);
+
+        VddV.v[0].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[0].uw[i]), c00);
+        VddV.v[0].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[1].uw[i]), c01);
+        VddV.v[0].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[1].uw[i]), c02);
+
+    } else if (uiV == 2) {
+        VddV.v[1].w[i]  = fMPY16US(fGETUBYTE(2,VuuV.v[0].uw[i]), c10);
+        VddV.v[1].w[i] += fMPY16US(fGETUBYTE(3,VuuV.v[0].uw[i]), c11);
+        VddV.v[1].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[1].uw[i]), c12);
+
+        VddV.v[1].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[0].uw[i]), c00);
+        VddV.v[1].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[0].uw[i]), c01);
+        VddV.v[1].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[1].uw[i]), c02);
+
+        VddV.v[0].w[i]  = fMPY16US(fGETUBYTE(0,VuuV.v[0].uw[i]), c10);
+        VddV.v[0].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[0].uw[i]), c11);
+        VddV.v[0].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[1].uw[i]), c12);
+
+    } else if (uiV == 3) {
+        VddV.v[1].w[i]  = fMPY16US(fGETUBYTE(2,VuuV.v[0].uw[i]), c00);
+        VddV.v[1].w[i] += fMPY16US(fGETUBYTE(3,VuuV.v[0].uw[i]), c01);
+        VddV.v[1].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[1].uw[i]), c02);
+
+        VddV.v[0].w[i]  = fMPY16US(fGETUBYTE(2,VuuV.v[0].uw[i]), c10);
+        VddV.v[0].w[i] += fMPY16US(fGETUBYTE(3,VuuV.v[0].uw[i]), c11);
+        VddV.v[0].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[1].uw[i]), c12);
+
+        VddV.v[0].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[0].uw[i]), c00);
+        VddV.v[0].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[0].uw[i]), c01);
+        VddV.v[0].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[1].uw[i]), c02);
+    }
+)
+
+ITERATOR_INSN_MPY_SLOT_DOUBLE_VEC(32, v6mpyhubs10, "Vdd32.w=v6mpy(Vuu32.ub,Vvv32.b,#u2):h", "",
+    fHIDE(short c00;)
+    fGET10BIT(c00, VvvV.v[0].uw[i], 0)
+    fHIDE(short c01;)
+    fGET10BIT(c01, VvvV.v[0].uw[i], 1)
+    fHIDE(short c02;)
+    fGET10BIT(c02, VvvV.v[0].uw[i], 2)
+    fHIDE(short c10;)
+    fGET10BIT(c10, VvvV.v[1].uw[i], 0)
+    fHIDE(short c11;)
+    fGET10BIT(c11, VvvV.v[1].uw[i], 1)
+    fHIDE(short c12;)
+    fGET10BIT(c12, VvvV.v[1].uw[i], 2)
+
+    if (uiV == 0) {
+        VddV.v[1].w[i]  = fMPY16US(fGETUBYTE(3,VuuV.v[1].uw[i]), c10);
+        VddV.v[1].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[1].uw[i]), c11);
+        VddV.v[1].w[i] += fMPY16US(fGETUBYTE(3,VuuV.v[0].uw[i]), c12);
+
+        VddV.v[1].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[1].uw[i]), c00);
+        VddV.v[1].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[1].uw[i]), c01);
+        VddV.v[1].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[0].uw[i]), c02);
+
+        VddV.v[0].w[i]  = fMPY16US(fGETUBYTE(2,VuuV.v[1].uw[i]), c10);
+        VddV.v[0].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[1].uw[i]), c11);
+        VddV.v[0].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[0].uw[i]), c12);
+
+    } else if (uiV == 1) {
+        VddV.v[1].w[i]  = fMPY16US(fGETUBYTE(3,VuuV.v[1].uw[i]), c00);
+        VddV.v[1].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[1].uw[i]), c01);
+        VddV.v[1].w[i] += fMPY16US(fGETUBYTE(3,VuuV.v[0].uw[i]), c02);
+
+        VddV.v[0].w[i]  = fMPY16US(fGETUBYTE(3,VuuV.v[1].uw[i]), c10);
+        VddV.v[0].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[1].uw[i]), c11);
+        VddV.v[0].w[i] += fMPY16US(fGETUBYTE(3,VuuV.v[0].uw[i]), c12);
+
+        VddV.v[0].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[1].uw[i]), c00);
+        VddV.v[0].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[1].uw[i]), c01);
+        VddV.v[0].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[0].uw[i]), c02);
+
+    } else if (uiV == 2) {
+        VddV.v[1].w[i]  = fMPY16US(fGETUBYTE(1,VuuV.v[1].uw[i]), c10);
+        VddV.v[1].w[i] += fMPY16US(fGETUBYTE(3,VuuV.v[0].uw[i]), c11);
+        VddV.v[1].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[0].uw[i]), c12);
+
+        VddV.v[1].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[1].uw[i]), c00);
+        VddV.v[1].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[0].uw[i]), c01);
+        VddV.v[1].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[0].uw[i]), c02);
+
+        VddV.v[0].w[i]  = fMPY16US(fGETUBYTE(0,VuuV.v[1].uw[i]), c10);
+        VddV.v[0].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[0].uw[i]), c11);
+        VddV.v[0].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[0].uw[i]), c12);
+
+    } else if (uiV == 3) {
+        VddV.v[1].w[i]  = fMPY16US(fGETUBYTE(1,VuuV.v[1].uw[i]), c00);
+        VddV.v[1].w[i] += fMPY16US(fGETUBYTE(3,VuuV.v[0].uw[i]), c01);
+        VddV.v[1].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[0].uw[i]), c02);
+
+        VddV.v[0].w[i]  = fMPY16US(fGETUBYTE(1,VuuV.v[1].uw[i]), c10);
+        VddV.v[0].w[i] += fMPY16US(fGETUBYTE(3,VuuV.v[0].uw[i]), c11);
+        VddV.v[0].w[i] += fMPY16US(fGETUBYTE(1,VuuV.v[0].uw[i]), c12);
+
+        VddV.v[0].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[1].uw[i]), c00);
+        VddV.v[0].w[i] += fMPY16US(fGETUBYTE(2,VuuV.v[0].uw[i]), c01);
+        VddV.v[0].w[i] += fMPY16US(fGETUBYTE(0,VuuV.v[0].uw[i]), c02);
+    }
+)
+
 
 EXTINSN(V6_vscattermhwq,  "if (Qs4) vscatter(Rt32,Mu2,Vvv32.w).h=Vw32", ATTRIBS(A_EXTENSION,A_CVI,A_CVI_SCATTER,A_CVI_VA_DV,A_CVI_VM,A_MEMLIKE), "Scatter halfwords conditional",
 {
-- 
2.25.1



* [PATCH 5/9] Hexagon (tests/tcg/hexagon) Add v68 HVX tests
  2023-04-26  2:30 [PATCH 0/9] Hexagon (target/hexagon) New architecture support Taylor Simpson
                   ` (3 preceding siblings ...)
  2023-04-26  2:30 ` [PATCH 4/9] Hexagon (target/hexagon) Add v68 HVX instructions Taylor Simpson
@ 2023-04-26  2:30 ` Taylor Simpson
  2023-04-27 13:43   ` Anton Johansson via
  2023-04-26  2:30 ` [PATCH 6/9] Hexagon (target/hexagon) Add v69 HVX instructions Taylor Simpson
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Taylor Simpson @ 2023-04-26  2:30 UTC (permalink / raw
  To: qemu-devel
  Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
	quic_mathbern

---
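Reviewer note (not part of the commit): the reference table in
v6mpy_ref.h can be spot-checked on the host. The sketch below
recomputes the low result word for the form the test executes,
"v6mpy(...,#1):v", following the uiV == 1 branch of the v6mpyvubs10
semantics from patch 4. Two things are assumptions, since neither is
shown in this series: the 10-bit coefficient layout behind fGET10BIT
(low 8 bits in byte POS, sign-extended high 2 bits at bit 24 + 2*POS),
and that init_buffers() in hvx_misc.h fills buffer0/buffer1 with
consecutive byte counters starting at 0 and 17. The helper names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed fGET10BIT layout: 8 low bits in byte pos, 2 signed high bits
 * in the field starting at bit 24 + 2*pos. */
static int16_t get10bit(uint32_t val, int pos)
{
    int32_t hi = (val >> (24 + 2 * pos)) & 0x3;
    if (hi & 0x2) {
        hi -= 4;                        /* sign-extend to -2..1 */
    }
    return (int16_t)(hi * 256 + (int32_t)((val >> (8 * pos)) & 0xff));
}

/* Unsigned byte n of a 32-bit word (mirrors fGETUBYTE). */
static uint32_t ubyte(uint32_t val, int n)
{
    return (val >> (8 * n)) & 0xff;
}

/* VddV.v[0].w[i] for the ":v" form with #u2 == 1.
 * uu0/uu1: lane i of Vuu.v[0]/Vuu.v[1]; vv0/vv1: lane i of Vvv. */
static int32_t v6mpy_v_sel1_lo(uint32_t uu0, uint32_t uu1,
                               uint32_t vv0, uint32_t vv1)
{
    int32_t acc;
    acc  = (int32_t)ubyte(uu0, 3) * get10bit(vv1, 0);   /* c10 */
    acc += (int32_t)ubyte(uu1, 2) * get10bit(vv1, 1);   /* c11 */
    acc += (int32_t)ubyte(uu1, 3) * get10bit(vv1, 2);   /* c12 */
    acc += (int32_t)ubyte(uu0, 1) * get10bit(vv0, 0);   /* c00 */
    acc += (int32_t)ubyte(uu1, 0) * get10bit(vv0, 1);   /* c01 */
    acc += (int32_t)ubyte(uu1, 1) * get10bit(vv0, 2);   /* c02 */
    return acc;
}

/* Lanes 0 and 1 of the first vector, compared with v6mpy_ref.h.
 * Inputs assume the buffer contents described in the note above. */
void check_first_lanes(void)
{
    assert((uint32_t)v6mpy_v_sel1_lo(0x14131211, 17, 0x03020100, 0)
           == 0xffffee11u);
    assert((uint32_t)v6mpy_v_sel1_lo(0x18171615, 18, 0x07060504, 1)
           == 0xfffffccau);
}
```

In the test, Vuu is v5:4 (buffer1, v6mpy_buffer1) and Vvv is v3:2
(buffer0, v6mpy_buffer0), and only v4 (the low half of the result) is
stored, which is why a single accumulator is enough here.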
 tests/tcg/hexagon/v6mpy_ref.h     | 161 ++++++++++++++++++++++++++++++
 tests/tcg/hexagon/v68_hvx.c       |  90 +++++++++++++++++
 tests/tcg/hexagon/Makefile.target |   3 +
 3 files changed, 254 insertions(+)
 create mode 100644 tests/tcg/hexagon/v6mpy_ref.h
 create mode 100644 tests/tcg/hexagon/v68_hvx.c

diff --git a/tests/tcg/hexagon/v6mpy_ref.h b/tests/tcg/hexagon/v6mpy_ref.h
new file mode 100644
index 0000000000..8258cddcb1
--- /dev/null
+++ b/tests/tcg/hexagon/v6mpy_ref.h
@@ -0,0 +1,161 @@
+/*
+ *  Copyright(c) 2021-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+{ 0xffffee11, 0xfffffcca, 0xffffc1b3, 0xffffd0cc,
+  0xffffe215, 0xfffff58e, 0xffffaf37, 0xffffc310,
+  0xffffd919, 0xfffff152, 0xffff9fbb, 0xffffb854,
+  0xffffd31d, 0xfffff016, 0xffff933f, 0xffffb098,
+  0xffffd021, 0xfffff1da, 0xffff89c3, 0xffffabdc,
+  0xffffd025, 0xfffff69e, 0xffff8347, 0xffffaa20,
+  0xffffd329, 0xfffffe62, 0xffff7fcb, 0xffffab64,
+  0xffffd92d, 0x00000926, 0xffff7f4f, 0xffffafa8,
+  },
+{ 0xffffe231, 0x000016ea, 0xffff81d3, 0xffffb6ec,
+  0xffffee35, 0x000027ae, 0xffff8757, 0xffffc130,
+  0xfffffd39, 0x00003b72, 0xffff8fdb, 0xffffce74,
+  0x00000f3d, 0x00005236, 0xffff9b5f, 0xffffdeb8,
+  0x00002441, 0x00006bfa, 0xffffa9e3, 0xfffff1fc,
+  0x00003c45, 0x000088be, 0xffffbb67, 0x00000840,
+  0x00005749, 0x0000a882, 0xffffcfeb, 0xffffe684,
+  0x0000494d, 0x00009a46, 0xffffb16f, 0x000002c8,
+  },
+{ 0xfffff351, 0x0000440a, 0xffff4af3, 0xffff9c0c,
+  0xffffef55, 0x000044ce, 0xffff4077, 0xffff9650,
+  0xffffee59, 0x00004892, 0xffff38fb, 0xffff9394,
+  0xfffff05d, 0x00004f56, 0xffff347f, 0xffff93d8,
+  0xfffff561, 0x0000591a, 0xffff3303, 0xffff971c,
+  0xfffffd65, 0x000065de, 0xffff3487, 0xffff9d60,
+  0x00000869, 0x000075a2, 0xffff390b, 0xffffa6a4,
+  0x0000166d, 0x00008866, 0xffff408f, 0xffffb2e8,
+  },
+{ 0x00002771, 0x00009e2a, 0xffff4b13, 0xffffc22c,
+  0x00003b75, 0x0000b6ee, 0xffff5897, 0xffffd470,
+  0x00005279, 0x0000d2b2, 0xffff691b, 0xffffe9b4,
+  0x00006c7d, 0x0000f176, 0xffff7c9f, 0x000001f8,
+  0x00008981, 0x0001133a, 0xffff9323, 0x00001d3c,
+  0x0000a985, 0x000137fe, 0xffffaca7, 0x00003b80,
+  0x0000cc89, 0x00015fc2, 0xffffc92b, 0xffffe1c4,
+  0x0000868d, 0x00011986, 0xffff72af, 0x00000608,
+  },
+{ 0xfffff891, 0x00008b4a, 0xfffed433, 0xffff674c,
+  0xfffffc95, 0x0000940e, 0xfffed1b7, 0xffff6990,
+  0x00000399, 0x00009fd2, 0xfffed23b, 0xffff6ed4,
+  0x00000d9d, 0x0000ae96, 0xfffed5bf, 0xffff7718,
+  0x00001aa1, 0x0000c05a, 0xfffedc43, 0xffff825c,
+  0x00002aa5, 0x0000d51e, 0xfffee5c7, 0xffff90a0,
+  0x00003da9, 0x0000ece2, 0xfffef24b, 0xffffa1e4,
+  0x000053ad, 0x000107a6, 0xffff01cf, 0xffffb628,
+  },
+{ 0x00006cb1, 0x0001256a, 0xffff1453, 0xffffcd6c,
+  0x000088b5, 0x0001462e, 0xffff29d7, 0xffffe7b0,
+  0x0000a7b9, 0x000169f2, 0xffff425b, 0x000004f4,
+  0x0000c9bd, 0x000190b6, 0xffff5ddf, 0x00002538,
+  0x0000eec1, 0x0001ba7a, 0xffff7c63, 0x0000487c,
+  0x000116c5, 0x0001e73e, 0xffff9de7, 0x00006ec0,
+  0x000141c9, 0x00021702, 0xffffc26b, 0xffffdd04,
+  0x0000c3cd, 0x000198c6, 0xffff33ef, 0x00000948,
+  },
+{ 0xfffffdd1, 0x0000d28a, 0xfffe5d73, 0xffff328c,
+  0x000009d5, 0x0000e34e, 0xfffe62f7, 0xffff3cd0,
+  0x000018d9, 0x0000f712, 0xfffe6b7b, 0xffff4a14,
+  0x00002add, 0x00010dd6, 0xfffe76ff, 0xffff5a58,
+  0x00003fe1, 0x0001279a, 0xfffe8583, 0xffff6d9c,
+  0x000057e5, 0x0001445e, 0xfffe9707, 0xffff83e0,
+  0x000072e9, 0x00016422, 0xfffeab8b, 0xffff9d24,
+  0x000090ed, 0x000186e6, 0xfffec30f, 0xffffb968,
+  },
+{ 0x0000b1f1, 0x0001acaa, 0xfffedd93, 0xffffd8ac,
+  0x0000d5f5, 0x0001d56e, 0xfffefb17, 0xfffffaf0,
+  0x0000fcf9, 0x00020132, 0xffff1b9b, 0x00002034,
+  0x000126fd, 0x00022ff6, 0xffff3f1f, 0x00008b36,
+  0x000093c3, 0x00009d80, 0x00009d6d, 0x0000a78a,
+  0x0000b4d7, 0x0000c354, 0x0000b801, 0x0000c6de,
+  0x0000d4eb, 0x0000e828, 0x0000d195, 0xffffea32,
+  0x00000fff, 0x000022fc, 0xfffffc29, 0x00000f86,
+  },
+{ 0xffffee13, 0xfffffcd0, 0xffffc1bd, 0xffffd0da,
+  0xffffe327, 0xfffff6a4, 0xffffb051, 0xffffc42e,
+  0xffffd73b, 0xffffef78, 0xffff9de5, 0xffffb682,
+  0xffffd24f, 0xffffef4c, 0xffff9279, 0xffffafd6,
+  0xffffd063, 0xfffff220, 0xffff8a0d, 0xffffac2a,
+  0xffffd177, 0xfffff7f4, 0xffff84a1, 0xffffab7e,
+  0xffffd18b, 0xfffffcc8, 0xffff7e35, 0xffffa9d2,
+  0xffffd89f, 0x0000089c, 0xffff7ec9, 0xffffaf26,
+  },
+{ 0xffffe2b3, 0x00001770, 0xffff825d, 0xffffb77a,
+  0xffffefc7, 0x00002944, 0xffff88f1, 0xffffc2ce,
+  0xfffffbdb, 0x00003a18, 0xffff8e85, 0xffffcd22,
+  0x00000eef, 0x000051ec, 0xffff9b19, 0xffffde76,
+  0x00002503, 0x00006cc0, 0xffffaaad, 0xfffff2ca,
+  0x00003e17, 0x00008a94, 0xffffbd41, 0x00000a1e,
+  0x0000562b, 0x0000a768, 0xffffced5, 0xffffe572,
+  0x0000493f, 0x00009a3c, 0xffffb169, 0x000002c6,
+  },
+{ 0xfffff353, 0x00004410, 0xffff4afd, 0xffff9c1a,
+  0xfffff067, 0x000045e4, 0xffff4191, 0xffff976e,
+  0xffffec7b, 0x000046b8, 0xffff3725, 0xffff91c2,
+  0xffffef8f, 0x00004e8c, 0xffff33b9, 0xffff9316,
+  0xfffff5a3, 0x00005960, 0xffff334d, 0xffff976a,
+  0xfffffeb7, 0x00006734, 0xffff35e1, 0xffff9ebe,
+  0x000006cb, 0x00007408, 0xffff3775, 0xffffa512,
+  0x000015df, 0x000087dc, 0xffff4009, 0xffffb266,
+  },
+{ 0x000027f3, 0x00009eb0, 0xffff4b9d, 0xffffc2ba,
+  0x00003d07, 0x0000b884, 0xffff5a31, 0xffffd60e,
+  0x0000511b, 0x0000d158, 0xffff67c5, 0xffffe862,
+  0x00006c2f, 0x0000f12c, 0xffff7c59, 0x000001b6,
+  0x00008a43, 0x00011400, 0xffff93ed, 0x00001e0a,
+  0x0000ab57, 0x000139d4, 0xffffae81, 0x00003d5e,
+  0x0000cb6b, 0x00015ea8, 0xffffc815, 0xffffe0b2,
+  0x0000867f, 0x0001197c, 0xffff72a9, 0x00000606,
+  },
+{ 0xfffff893, 0x00008b50, 0xfffed43d, 0xffff675a,
+  0xfffffda7, 0x00009524, 0xfffed2d1, 0xffff6aae,
+  0x000001bb, 0x00009df8, 0xfffed065, 0xffff6d02,
+  0x00000ccf, 0x0000adcc, 0xfffed4f9, 0xffff7656,
+  0x00001ae3, 0x0000c0a0, 0xfffedc8d, 0xffff82aa,
+  0x00002bf7, 0x0000d674, 0xfffee721, 0xffff91fe,
+  0x00003c0b, 0x0000eb48, 0xfffef0b5, 0xffffa052,
+  0x0000531f, 0x0001071c, 0xffff0149, 0xffffb5a6,
+  },
+{ 0x00006d33, 0x000125f0, 0xffff14dd, 0xffffcdfa,
+  0x00008a47, 0x000147c4, 0xffff2b71, 0xffffe94e,
+  0x0000a65b, 0x00016898, 0xffff4105, 0x000003a2,
+  0x0000c96f, 0x0001906c, 0xffff5d99, 0x000024f6,
+  0x0000ef83, 0x0001bb40, 0xffff7d2d, 0x0000494a,
+  0x00011897, 0x0001e914, 0xffff9fc1, 0x0000709e,
+  0x000140ab, 0x000215e8, 0xffffc155, 0xffffdbf2,
+  0x0000c3bf, 0x000198bc, 0xffff33e9, 0x00000946,
+  },
+{ 0xfffffdd3, 0x0000d290, 0xfffe5d7d, 0xffff329a,
+  0x00000ae7, 0x0000e464, 0xfffe6411, 0xffff3dee,
+  0x000016fb, 0x0000f538, 0xfffe69a5, 0xffff4842,
+  0x00002a0f, 0x00010d0c, 0xfffe7639, 0xffff5996,
+  0x00004023, 0x000127e0, 0xfffe85cd, 0xffff6dea,
+  0x00005937, 0x000145b4, 0xfffe9861, 0xffff853e,
+  0x0000714b, 0x00016288, 0xfffea9f5, 0xffff9b92,
+  0x0000905f, 0x0001865c, 0xfffec289, 0xffffb8e6,
+  },
+{ 0x0000b273, 0x0001ad30, 0xfffede1d, 0xffffd93a,
+  0x0000d787, 0x0001d704, 0xfffefcb1, 0xfffffc8e,
+  0x0000fb9b, 0x0001ffd8, 0xffff1a45, 0x00001ee2,
+  0x000126af, 0x00022fac, 0xffff3ed9, 0x00008af4,
+  0x00009485, 0x00009e46, 0x00009e37, 0x0000a858,
+  0x0000b6a9, 0x0000c52a, 0x0000b9db, 0x0000c8bc,
+  0x0000d3cd, 0x0000e70e, 0x0000d07f, 0xffffe920,
+  0x00000ff1, 0x000022f2, 0xfffffc23, 0x00000f84,
+  },
diff --git a/tests/tcg/hexagon/v68_hvx.c b/tests/tcg/hexagon/v68_hvx.c
new file mode 100644
index 0000000000..5a196e0155
--- /dev/null
+++ b/tests/tcg/hexagon/v68_hvx.c
@@ -0,0 +1,90 @@
+/*
+ *  Copyright(c) 2022-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <string.h>
+#include <limits.h>
+
+int err;
+
+#include "hvx_misc.h"
+
+MMVector v6mpy_buffer0[BUFSIZE] __attribute__((aligned(MAX_VEC_SIZE_BYTES)));
+MMVector v6mpy_buffer1[BUFSIZE] __attribute__((aligned(MAX_VEC_SIZE_BYTES)));
+
+static void init_v6mpy_buffers(void)
+{
+    int counter0 = 0;
+    int counter1 = 17;
+    for (int i = 0; i < BUFSIZE; i++) {
+        for (int j = 0; j < MAX_VEC_SIZE_BYTES / 4; j++) {
+            v6mpy_buffer0[i].w[j] = counter0++;
+            v6mpy_buffer1[i].w[j] = counter1++;
+        }
+    }
+}
+
+int v6mpy_ref[BUFSIZE][MAX_VEC_SIZE_BYTES / 4] = {
+#include "v6mpy_ref.h"
+};
+
+static void test_v6mpy(void)
+{
+    void *p00 = buffer0;
+    void *p01 = v6mpy_buffer0;
+    void *p10 = buffer1;
+    void *p11 = v6mpy_buffer1;
+    void *pout = output;
+
+    memset(expect, 0xff, sizeof(expect));
+    memset(output, 0xff, sizeof(output));
+
+    for (int i = 0; i < BUFSIZE; i++) {
+        asm("v2 = vmem(%0 + #0)\n\t"
+            "v3 = vmem(%1 + #0)\n\t"
+            "v4 = vmem(%2 + #0)\n\t"
+            "v5 = vmem(%3 + #0)\n\t"
+            "v5:4.w = v6mpy(v5:4.ub, v3:2.b, #1):v\n\t"
+            "vmem(%4 + #0) = v4\n\t"
+            : : "r"(p00), "r"(p01), "r"(p10), "r"(p11), "r"(pout)
+            : "v2", "v3", "v4", "v5", "memory");
+        p00 += sizeof(MMVector);
+        p01 += sizeof(MMVector);
+        p10 += sizeof(MMVector);
+        p11 += sizeof(MMVector);
+        pout += sizeof(MMVector);
+
+        for (int j = 0; j < MAX_VEC_SIZE_BYTES / 4; j++) {
+            expect[i].w[j] = v6mpy_ref[i][j];
+        }
+    }
+
+    check_output_w(__LINE__, BUFSIZE);
+}
+
+int main(void)
+{
+    init_buffers();
+    init_v6mpy_buffers();
+
+    test_v6mpy();
+
+    puts(err ? "FAIL" : "PASS");
+    return err ? 1 : 0;
+}
diff --git a/tests/tcg/hexagon/Makefile.target b/tests/tcg/hexagon/Makefile.target
index b7529e23bc..cd834f35c5 100644
--- a/tests/tcg/hexagon/Makefile.target
+++ b/tests/tcg/hexagon/Makefile.target
@@ -77,6 +77,7 @@ HEX_TESTS += test_vpmpyh
 HEX_TESTS += test_vspliceb
 
 HEX_TESTS += v68_scalar
+HEX_TESTS += v68_hvx
 
 TESTS += $(HEX_TESTS)
 
@@ -92,6 +93,8 @@ vector_add_int: CFLAGS += -mhvx -fvectorize
 hvx_misc: hvx_misc.c hvx_misc.h
 hvx_misc: CFLAGS += -mhvx
 hvx_histogram: CFLAGS += -mhvx -Wno-gnu-folding-constant
+v68_hvx: v68_hvx.c hvx_misc.h v6mpy_ref.h
+v68_hvx: CFLAGS += -mhvx -Wno-unused-function
 
 hvx_histogram: hvx_histogram.c hvx_histogram_row.S
 	$(CC) $(CFLAGS) $(CROSS_CC_GUEST_CFLAGS) $^ -o $@ $(LDFLAGS)
-- 
2.25.1



* [PATCH 6/9] Hexagon (target/hexagon) Add v69 HVX instructions
  2023-04-26  2:30 [PATCH 0/9] Hexagon (target/hexagon) New architecture support Taylor Simpson
                   ` (4 preceding siblings ...)
  2023-04-26  2:30 ` [PATCH 5/9] Hexagon (tests/tcg/hexagon) Add v68 HVX tests Taylor Simpson
@ 2023-04-26  2:30 ` Taylor Simpson
  2023-04-27 13:56   ` Anton Johansson via
  2023-04-26  2:30 ` [PATCH 7/9] Hexagon (tests/tcg/hexagon) Add v69 HVX tests Taylor Simpson
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Taylor Simpson @ 2023-04-26  2:30 UTC (permalink / raw
  To: qemu-devel
  Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
	quic_mathbern

The following instructions are added
    V6_vasrvuhubrndsat
    V6_vasrvuhubsat
    V6_vasrvwuhrndsat
    V6_vasrvwuhsat
    V6_vassign_tmp
    V6_vcombine_tmp
    V6_vmpyuhvs

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
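Reviewer note (not part of the commit): a scalar sketch of what one
halfword lane of the new Vd32.ub=vasr(Vuu32.uh,Vv32.ub)[:rnd]:sat
instructions computes, read off the NARROWING_VECTOR_SHIFT macro in
this patch. Each lane takes its two shift amounts from consecutive
bytes of Vv (masked with 0x7), shifts the matching halfword of each
half of Vuu, saturates to an unsigned byte, and packs the two results.
The rounding step is an assumption about fVROUND (add half of the
shifted-out range when the shift amount is non-zero), and the function
names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Clamp to 0..255, as fVSATUB is expected to do. */
static uint8_t sat_ub(int32_t v)
{
    return v < 0 ? 0 : (v > 255 ? 255 : (uint8_t)v);
}

/* Assumed fVROUND behaviour: add 1 << (shamt - 1) when shamt > 0. */
static int32_t vround(int32_t v, unsigned shamt)
{
    return v + (shamt > 0 ? (1 << (shamt - 1)) : 0);
}

/*
 * One lane i of V6_vasrvuhubsat (rnd == false) or
 * V6_vasrvuhubrndsat (rnd == true).
 * u0/u1: VuuV.v[0].uh[i] and VuuV.v[1].uh[i]
 * sh0/sh1: VvV.ub[2*i+0] and VvV.ub[2*i+1]
 * Returns byte 0 and byte 1 of the result lane packed as a halfword.
 */
uint16_t vasrvuhub_lane(uint16_t u0, uint16_t u1,
                        uint8_t sh0, uint8_t sh1, bool rnd)
{
    unsigned s0 = sh0 & 0x7;            /* SHAMTMASK for half-to-byte */
    unsigned s1 = sh1 & 0x7;
    int32_t r0 = (rnd ? vround(u0, s0) : u0) >> s0;
    int32_t r1 = (rnd ? vround(u1, s1) : u1) >> s1;
    return (uint16_t)(sat_ub(r0) | (sat_ub(r1) << 8));
}
```

For example, u0 = 0x0400 shifted by 2 gives 256 and saturates to 0xff,
while u1 = 0x0014 with shift byte 0x0b (masked to 3) gives 2 without
rounding and 3 with rounding. The word-to-half variants follow the same
pattern with a 0xF mask and saturation to an unsigned halfword.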
 target/hexagon/gen_tcg_hvx.h                 | 12 ++++++
 target/hexagon/attribs_def.h.inc             |  8 ++++
 target/hexagon/imported/mmvec/encode_ext.def |  8 ++++
 target/hexagon/imported/mmvec/ext.idef       | 40 ++++++++++++++++++++
 4 files changed, 68 insertions(+)

diff --git a/target/hexagon/gen_tcg_hvx.h b/target/hexagon/gen_tcg_hvx.h
index d4aefe8e3f..8dceead5e5 100644
--- a/target/hexagon/gen_tcg_hvx.h
+++ b/target/hexagon/gen_tcg_hvx.h
@@ -128,6 +128,18 @@ static inline void assert_vhist_tmp(DisasContext *ctx)
     tcg_gen_gvec_mov(MO_64, VdV_off, VuV_off, \
                      sizeof(MMVector), sizeof(MMVector))
 
+#define fGEN_TCG_V6_vassign_tmp(SHORTCODE) \
+    tcg_gen_gvec_mov(MO_64, VdV_off, VuV_off, \
+                     sizeof(MMVector), sizeof(MMVector))
+
+#define fGEN_TCG_V6_vcombine_tmp(SHORTCODE) \
+    do { \
+        tcg_gen_gvec_mov(MO_64, VddV_off, VvV_off, \
+                         sizeof(MMVector), sizeof(MMVector)); \
+        tcg_gen_gvec_mov(MO_64, VddV_off + sizeof(MMVector), VuV_off, \
+                         sizeof(MMVector), sizeof(MMVector)); \
+    } while (0)
+
 /* Vector conditional move */
 #define fGEN_TCG_VEC_CMOV(PRED) \
     do { \
diff --git a/target/hexagon/attribs_def.h.inc b/target/hexagon/attribs_def.h.inc
index 0ddfb45bdf..3bef60bef3 100644
--- a/target/hexagon/attribs_def.h.inc
+++ b/target/hexagon/attribs_def.h.inc
@@ -69,11 +69,13 @@ DEF_ATTRIB(CVI_VP_VS, "Double vector permute/shft insn executes on HVX", "", "")
 DEF_ATTRIB(CVI_VX, "Multiply instruction executes on HVX", "", "")
 DEF_ATTRIB(CVI_VX_DV, "Double vector multiply insn executes on HVX", "", "")
 DEF_ATTRIB(CVI_VS, "Shift instruction executes on HVX", "", "")
+DEF_ATTRIB(CVI_VS_3SRC, "This shift needs to borrow a source register", "", "")
 DEF_ATTRIB(CVI_VS_VX, "Permute/shift and multiply insn executes on HVX", "", "")
 DEF_ATTRIB(CVI_VA, "ALU instruction executes on HVX", "", "")
 DEF_ATTRIB(CVI_VA_DV, "Double vector alu instruction executes on HVX", "", "")
 DEF_ATTRIB(CVI_4SLOT, "Consumes all the vector execution resources", "", "")
 DEF_ATTRIB(CVI_TMP, "Transient Memory Load not written to register", "", "")
+DEF_ATTRIB(CVI_REMAP, "Register Renaming not written to register file", "", "")
 DEF_ATTRIB(CVI_GATHER, "CVI Gather operation", "", "")
 DEF_ATTRIB(CVI_SCATTER, "CVI Scatter operation", "", "")
 DEF_ATTRIB(CVI_SCATTER_RELEASE, "CVI Store Release for scatter", "", "")
@@ -147,6 +149,8 @@ DEF_ATTRIB(L2FETCH, "Instruction is l2fetch type", "", "")
 DEF_ATTRIB(ICINVA, "icinva", "", "")
 DEF_ATTRIB(DCCLEANINVA, "dccleaninva", "", "")
 
+DEF_ATTRIB(NO_INTRINSIC, "Don't generate an intrinsic", "", "")
+
 /* Documentation Notes */
 DEF_ATTRIB(NOTE_CONDITIONAL, "can be conditionally executed", "", "")
 DEF_ATTRIB(NOTE_NEWVAL_SLOT0, "New-value oprnd must execute on slot 0", "", "")
@@ -155,7 +159,11 @@ DEF_ATTRIB(NOTE_NOPACKET, "solo instruction", "", "")
 DEF_ATTRIB(NOTE_AXOK, "May only be grouped with ALU32 or non-FP XTYPE.", "", "")
 DEF_ATTRIB(NOTE_LATEPRED, "The predicate can not be used as a .new", "", "")
 DEF_ATTRIB(NOTE_NVSLOT0, "Can execute only in slot 0 (ST)", "", "")
+DEF_ATTRIB(NOTE_NOVP, "Cannot be paired with a HVX permute instruction", "", "")
+DEF_ATTRIB(NOTE_VA_UNARY, "Combined with HVX ALU op (must be unary)", "", "")
 
+/* V6 MMVector Notes for Documentation */
+DEF_ATTRIB(NOTE_SHIFT_RESOURCE, "Uses the HVX shift resource.", "", "")
 /* Restrictions to make note of */
 DEF_ATTRIB(RESTRICT_NOSLOT1_STORE, "Packet must not have slot 1 store", "", "")
 DEF_ATTRIB(RESTRICT_LATEPRED, "Predicate can not be used as a .new.", "", "")
diff --git a/target/hexagon/imported/mmvec/encode_ext.def b/target/hexagon/imported/mmvec/encode_ext.def
index b9b62fef8d..402438f566 100644
--- a/target/hexagon/imported/mmvec/encode_ext.def
+++ b/target/hexagon/imported/mmvec/encode_ext.def
@@ -257,6 +257,11 @@ DEF_ENC(V6_vasruhubrndsat,         ICLASS_CJ" 1 000 vvv vvttt PP 0 uuuuu 111 ddd
 DEF_ENC(V6_vasruwuhsat,         ICLASS_CJ" 1 000 vvv vvttt PP 1 uuuuu 100 ddddd") //
 DEF_ENC(V6_vasruhubsat,            ICLASS_CJ" 1 000 vvv vvttt PP 1 uuuuu 101 ddddd") //
 
+DEF_ENC(V6_vasrvuhubrndsat,"00011101000vvvvvPP0uuuuu011ddddd")
+DEF_ENC(V6_vasrvuhubsat,"00011101000vvvvvPP0uuuuu010ddddd")
+DEF_ENC(V6_vasrvwuhrndsat,"00011101000vvvvvPP0uuuuu001ddddd")
+DEF_ENC(V6_vasrvwuhsat,"00011101000vvvvvPP0uuuuu000ddddd")
+
 /***************************************************************
 *
 *  Group #1, Uses Q6 Rt32
@@ -716,6 +721,7 @@ DEF_ENC(V6_vaddclbw,    ICLASS_CJ" 1 111 000 vvvvv PP 1 uuuuu 001 ddddd") //
 
 DEF_ENC(V6_vavguw,        ICLASS_CJ" 1 111 000 vvvvv PP 1 uuuuu 010 ddddd") //
 DEF_ENC(V6_vavguwrnd,    ICLASS_CJ" 1 111 000 vvvvv PP 1 uuuuu 011 ddddd") //
+DEF_ENC(V6_vassign_tmp,"00011110--0---01PP0uuuuu110ddddd")
 DEF_ENC(V6_vavgb,        ICLASS_CJ" 1 111 000 vvvvv PP 1 uuuuu 100 ddddd") //
 DEF_ENC(V6_vavgbrnd,    ICLASS_CJ" 1 111 000 vvvvv PP 1 uuuuu 101 ddddd") //
 DEF_ENC(V6_vnavgb,        ICLASS_CJ" 1 111 000 vvvvv PP 1 uuuuu 110 ddddd") //
@@ -741,6 +747,7 @@ DEF_ENC(V6_vshufoh,      ICLASS_CJ" 1 111 010 vvvvv PP 0 uuuuu 100 ddddd") //
 DEF_ENC(V6_vshufoeh,      ICLASS_CJ" 1 111 010 vvvvv PP 0 uuuuu 101 ddddd") //
 DEF_ENC(V6_vshufoeb,      ICLASS_CJ" 1 111 010 vvvvv PP 0 uuuuu 110 ddddd") //
 DEF_ENC(V6_vcombine,     ICLASS_CJ" 1 111 010 vvvvv PP 0 uuuuu 111 ddddd") //
+DEF_ENC(V6_vcombine_tmp,"00011110101vvvvvPP0uuuuu111ddddd")
 
 DEF_ENC(V6_v6mpyvubs10,  ICLASS_CJ" 1 111 010 vvvvv PP 1 uuuuu 0ii ddddd")
 DEF_ENC(V6_v6mpyhubs10,  ICLASS_CJ" 1 111 010 vvvvv PP 1 uuuuu 1ii ddddd")
@@ -795,6 +802,7 @@ DEF_ENC(V6_vrounduhub,     ICLASS_CJ" 1 111 111 vvvvv PP 0 uuuuu 011 ddddd") //
 DEF_ENC(V6_vrounduwuh,     ICLASS_CJ" 1 111 111 vvvvv PP 0 uuuuu 100 ddddd") //
 DEF_ENC(V6_vmpyewuh,    ICLASS_CJ" 1 111 111 vvvvv PP 0 uuuuu 101 ddddd")
 DEF_ENC(V6_vmpyowh,        ICLASS_CJ" 1 111 111 vvvvv PP 0 uuuuu 111 ddddd")
+DEF_ENC(V6_vmpyuhvs,"00011111110vvvvvPP1uuuuu111ddddd")
 
 
 #endif /* NO MMVEC */
diff --git a/target/hexagon/imported/mmvec/ext.idef b/target/hexagon/imported/mmvec/ext.idef
index c0d169fd4f..ead32c243b 100644
--- a/target/hexagon/imported/mmvec/ext.idef
+++ b/target/hexagon/imported/mmvec/ext.idef
@@ -62,6 +62,9 @@ EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VS),  \
 DESCR, DO_FOR_EACH_CODE(WIDTH, CODE))
 
 
+#define ITERATOR_INSN_SHIFT3_SLOT(WIDTH,TAG,SYNTAX,DESCR,CODE) \
+EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VS,A_CVI_VS_3SRC,A_NOTE_SHIFT_RESOURCE,A_NOTE_NOVP,A_NOTE_VA_UNARY),  \
+DESCR, DO_FOR_EACH_CODE(WIDTH, CODE))
 
 #define ITERATOR_INSN_SHIFT_SLOT_VV_LATE(WIDTH,TAG,SYNTAX,DESCR,CODE) \
 EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VS),  \
@@ -980,6 +983,22 @@ NARROWING_SHIFT(16,vasrhubrndsat,fSETBYTE,ub,h,:rnd:sat,fVSATUB,fVROUND,0x7)
 NARROWING_SHIFT(16,vasrhbsat,fSETBYTE,b,h,:sat,fVSATB,fVNOROUND,0x7)
 NARROWING_SHIFT(16,vasrhbrndsat,fSETBYTE,b,h,:rnd:sat,fVSATB,fVROUND,0x7)
 
+#define NARROWING_VECTOR_SHIFT(ITERSIZE,TAG,DSTM,DSTTYPE,SRCTYPE,SRCTYPE2,SYNOPTS,SATFUNC,RNDFUNC,SHAMTMASK) \
+ITERATOR_INSN_SHIFT3_SLOT(ITERSIZE,TAG, \
+"Vd32." #DSTTYPE "=vasr(Vuu32." #SRCTYPE ",Vv32." #SRCTYPE2 ")" #SYNOPTS, \
+"Vector shift by vector right and shuffle", \
+    fHIDE(int )shamt = VvV.SRCTYPE2[2*i+0] & SHAMTMASK; \
+    DSTM(0,VdV.SRCTYPE[i],SATFUNC(RNDFUNC(VuuV.v[0].SRCTYPE[i],shamt) >> shamt)); \
+    shamt = VvV.SRCTYPE2[2*i+1] & SHAMTMASK; \
+    DSTM(1,VdV.SRCTYPE[i],SATFUNC(RNDFUNC(VuuV.v[1].SRCTYPE[i],shamt) >> shamt)))
+
+/* WORD TO HALF*/
+NARROWING_VECTOR_SHIFT(32,vasrvwuhsat,fSETHALF,uh,w,uh,:sat,fVSATUH,fVNOROUND,0xF)
+NARROWING_VECTOR_SHIFT(32,vasrvwuhrndsat,fSETHALF,uh,w,uh,:rnd:sat,fVSATUH,fVROUND,0xF)
+/* HALF TO BYTE*/
+NARROWING_VECTOR_SHIFT(16,vasrvuhubsat,fSETBYTE,ub,uh,ub,:sat,fVSATUB,fVNOROUND,0x7)
+NARROWING_VECTOR_SHIFT(16,vasrvuhubrndsat,fSETBYTE,ub,uh,ub,:rnd:sat,fVSATUB,fVROUND,0x7)
+
 NARROWING_SHIFT_NOV1(16,vasruhubsat,fSETBYTE,ub,uh,:sat,fVSATUB,fVNOROUND,0x7)
 NARROWING_SHIFT_NOV1(16,vasruhubrndsat,fSETBYTE,ub,uh,:rnd:sat,fVSATUB,fVROUND,0x7)
 
@@ -1364,6 +1383,9 @@ ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpyhvsrs,"Vd32=vmpyh(Vu32,Vv32):<<1:rnd:s
 
 
 
+ITERATOR_INSN_MPY_SLOT(16,vmpyuhvs, "Vd32.uh=vmpy(Vu32.uh,Vv32.uh):>>16",
+"Vector by Vector Unsigned Halfword Multiply with 16 bit rightshift",
+    VdV.uh[i] = fGETUHALF(1,fMPY16UU(VuV.uh[i],VvV.uh[i])))
 
 
 ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyhus, "Vdd32=vmpyhus(Vu32,Vv32)","Vdd32.w=vmpy(Vu32.h,Vv32.uh)",
@@ -2042,6 +2064,24 @@ ITERATOR_INSN_ANY_SLOT_DOUBLE_VEC(8,vcombine,"Vdd32=vcombine(Vu32,Vv32)",
 
 ///////////////////////////////////////////////////////////////////////////
 
+EXTINSN(V6_vcombine_tmp, "Vdd32.tmp=vcombine(Vu32,Vv32)",    ATTRIBS(A_EXTENSION,A_CVI,A_CVI_REMAP,A_CVI_TMP,A_NO_INTRINSIC),
+"Vector assign tmp, Any two to Vector Pair ",
+{
+   fHIDE(int i;)
+    fVFOREACH(8, i) {
+           VddV.v[0].ub[i] = VvV.ub[i];
+           VddV.v[1].ub[i] = VuV.ub[i];
+    }
+})
+
+EXTINSN(V6_vassign_tmp, "Vd32.tmp=Vu32",    ATTRIBS(A_EXTENSION,A_CVI,A_CVI_REMAP,A_CVI_TMP,A_NO_INTRINSIC),
+"Vector assign tmp, Any two to Vector Pair ",
+{
+   fHIDE(int i;)
+    fVFOREACH(32, i) {
+           VdV.w[i]=VuV.w[i];
+    }
+})
 
 /*********************************************************
 * GENERAL PERMUTE NETWORKS
-- 
2.25.1



* [PATCH 7/9] Hexagon (tests/tcg/hexagon) Add v69 HVX tests
  2023-04-26  2:30 [PATCH 0/9] Hexagon (target/hexagon) New architecture support Taylor Simpson
                   ` (5 preceding siblings ...)
  2023-04-26  2:30 ` [PATCH 6/9] Hexagon (target/hexagon) Add v69 HVX instructions Taylor Simpson
@ 2023-04-26  2:30 ` Taylor Simpson
  2023-04-27 14:39   ` Anton Johansson via
  2023-04-26  2:30 ` [PATCH 8/9] Hexagon (target/hexagon) Add v73 scalar instructions Taylor Simpson
  2023-04-26  2:30 ` [PATCH 9/9] Hexagon (tests/tcg/hexagon) Add v73 scalar tests Taylor Simpson
  8 siblings, 1 reply; 20+ messages in thread
From: Taylor Simpson @ 2023-04-26  2:30 UTC (permalink / raw
  To: qemu-devel
  Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
	quic_mathbern

The following instructions are tested
    V6_vasrvuhubrndsat
    V6_vasrvuhubsat
    V6_vasrvwuhrndsat
    V6_vasrvwuhsat
    V6_vassign_tmp
    V6_vcombine_tmp
    V6_vmpyuhvs

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 tests/tcg/hexagon/v69_hvx.c       | 318 ++++++++++++++++++++++++++++++
 tests/tcg/hexagon/Makefile.target |   3 +
 2 files changed, 321 insertions(+)
 create mode 100644 tests/tcg/hexagon/v69_hvx.c

diff --git a/tests/tcg/hexagon/v69_hvx.c b/tests/tcg/hexagon/v69_hvx.c
new file mode 100644
index 0000000000..051e5420df
--- /dev/null
+++ b/tests/tcg/hexagon/v69_hvx.c
@@ -0,0 +1,318 @@
+/*
+ *  Copyright(c) 2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <string.h>
+#include <limits.h>
+
+int err;
+
+#include "hvx_misc.h"
+
+#define fVROUND(VAL, SHAMT) \
+    ((VAL) + (((SHAMT) > 0) ? (1LL << ((SHAMT) - 1)) : 0))
+
+#define fVSATUB(VAL) \
+    ((((VAL) & 0xffLL) == (VAL)) ? \
+        (VAL) : \
+        ((((int32_t)(VAL)) < 0) ? 0 : 0xff))
+
+#define fVSATUH(VAL) \
+    ((((VAL) & 0xffffLL) == (VAL)) ? \
+        (VAL) : \
+        ((((int32_t)(VAL)) < 0) ? 0 : 0xffff))
+
+static void test_vasrvuhubrndsat(void)
+{
+    void *p0 = buffer0;
+    void *p1 = buffer1;
+    void *pout = output;
+
+    memset(expect, 0xaa, sizeof(expect));
+    memset(output, 0xbb, sizeof(output));
+
+    for (int i = 0; i < BUFSIZE / 2; i++) {
+        asm("v4 = vmem(%0 + #0)\n\t"
+            "v5 = vmem(%0 + #1)\n\t"
+            "v6 = vmem(%1 + #0)\n\t"
+            "v5.ub = vasr(v5:4.uh, v6.ub):rnd:sat\n\t"
+            "vmem(%2) = v5\n\t"
+            : : "r"(p0), "r"(p1), "r"(pout)
+            : "v4", "v5", "v6", "memory");
+        p0 += sizeof(MMVector) * 2;
+        p1 += sizeof(MMVector);
+        pout += sizeof(MMVector);
+
+        for (int j = 0; j < MAX_VEC_SIZE_BYTES / 2; j++) {
+            int shamt;
+            uint8_t byte0;
+            uint8_t byte1;
+
+            shamt = buffer1[i].ub[2 * j + 0] & 0x7;
+            byte0 = fVSATUB(fVROUND(buffer0[2 * i + 0].uh[j], shamt) >> shamt);
+            shamt = buffer1[i].ub[2 * j + 1] & 0x7;
+            byte1 = fVSATUB(fVROUND(buffer0[2 * i + 1].uh[j], shamt) >> shamt);
+            expect[i].uh[j] = (byte1 << 8) | (byte0 & 0xff);
+        }
+    }
+
+    check_output_h(__LINE__, BUFSIZE / 2);
+}
+
+static void test_vasrvuhubsat(void)
+{
+    void *p0 = buffer0;
+    void *p1 = buffer1;
+    void *pout = output;
+
+    memset(expect, 0xaa, sizeof(expect));
+    memset(output, 0xbb, sizeof(output));
+
+    for (int i = 0; i < BUFSIZE / 2; i++) {
+        asm("v4 = vmem(%0 + #0)\n\t"
+            "v5 = vmem(%0 + #1)\n\t"
+            "v6 = vmem(%1 + #0)\n\t"
+            "v5.ub = vasr(v5:4.uh, v6.ub):sat\n\t"
+            "vmem(%2) = v5\n\t"
+            : : "r"(p0), "r"(p1), "r"(pout)
+            : "v4", "v5", "v6", "memory");
+        p0 += sizeof(MMVector) * 2;
+        p1 += sizeof(MMVector);
+        pout += sizeof(MMVector);
+
+        for (int j = 0; j < MAX_VEC_SIZE_BYTES / 2; j++) {
+            int shamt;
+            uint8_t byte0;
+            uint8_t byte1;
+
+            shamt = buffer1[i].ub[2 * j + 0] & 0x7;
+            byte0 = fVSATUB(buffer0[2 * i + 0].uh[j] >> shamt);
+            shamt = buffer1[i].ub[2 * j + 1] & 0x7;
+            byte1 = fVSATUB(buffer0[2 * i + 1].uh[j] >> shamt);
+            expect[i].uh[j] = (byte1 << 8) | (byte0 & 0xff);
+        }
+    }
+
+    check_output_h(__LINE__, BUFSIZE / 2);
+}
+
+static void test_vasrvwuhrndsat(void)
+{
+    void *p0 = buffer0;
+    void *p1 = buffer1;
+    void *pout = output;
+
+    memset(expect, 0xaa, sizeof(expect));
+    memset(output, 0xbb, sizeof(output));
+
+    for (int i = 0; i < BUFSIZE / 2; i++) {
+        asm("v4 = vmem(%0 + #0)\n\t"
+            "v5 = vmem(%0 + #1)\n\t"
+            "v6 = vmem(%1 + #0)\n\t"
+            "v5.uh = vasr(v5:4.w, v6.uh):rnd:sat\n\t"
+            "vmem(%2) = v5\n\t"
+            : : "r"(p0), "r"(p1), "r"(pout)
+            : "v4", "v5", "v6", "memory");
+        p0 += sizeof(MMVector) * 2;
+        p1 += sizeof(MMVector);
+        pout += sizeof(MMVector);
+
+        for (int j = 0; j < MAX_VEC_SIZE_BYTES / 2; j++) {
+            int shamt;
+            uint16_t half0;
+            uint16_t half1;
+
+            shamt = buffer1[i].uh[2 * j + 0] & 0xf;
+            half0 = fVSATUH(fVROUND(buffer0[2 * i + 0].w[j], shamt) >> shamt);
+            shamt = buffer1[i].uh[2 * j + 1] & 0xf;
+            half1 = fVSATUH(fVROUND(buffer0[2 * i + 1].w[j], shamt) >> shamt);
+            expect[i].w[j] = (half1 << 16) | (half0 & 0xffff);
+        }
+    }
+
+    check_output_w(__LINE__, BUFSIZE / 2);
+}
+
+static void test_vasrvwuhsat(void)
+{
+    void *p0 = buffer0;
+    void *p1 = buffer1;
+    void *pout = output;
+
+    memset(expect, 0xaa, sizeof(expect));
+    memset(output, 0xbb, sizeof(output));
+
+    for (int i = 0; i < BUFSIZE / 2; i++) {
+        asm("v4 = vmem(%0 + #0)\n\t"
+            "v5 = vmem(%0 + #1)\n\t"
+            "v6 = vmem(%1 + #0)\n\t"
+            "v5.uh = vasr(v5:4.w, v6.uh):sat\n\t"
+            "vmem(%2) = v5\n\t"
+            : : "r"(p0), "r"(p1), "r"(pout)
+            : "v4", "v5", "v6", "memory");
+        p0 += sizeof(MMVector) * 2;
+        p1 += sizeof(MMVector);
+        pout += sizeof(MMVector);
+
+        for (int j = 0; j < MAX_VEC_SIZE_BYTES / 2; j++) {
+            int shamt;
+            uint16_t half0;
+            uint16_t half1;
+
+            shamt = buffer1[i].uh[2 * j + 0] & 0xf;
+            half0 = fVSATUH(buffer0[2 * i + 0].w[j] >> shamt);
+            shamt = buffer1[i].uh[2 * j + 1] & 0xf;
+            half1 = fVSATUH(buffer0[2 * i + 1].w[j] >> shamt);
+            expect[i].w[j] = (half1 << 16) | (half0 & 0xffff);
+        }
+    }
+
+    check_output_w(__LINE__, BUFSIZE / 2);
+}
+
+static void test_vassign_tmp(void)
+{
+    void *p0 = buffer0;
+    void *pout = output;
+
+    memset(expect, 0xaa, sizeof(expect));
+    memset(output, 0xbb, sizeof(output));
+
+    for (int i = 0; i < BUFSIZE; i++) {
+        /*
+         * Assign into v12 as .tmp, then use it in the next packet
+         * Should get the new value within the same packet and
+         * the old value in the next packet
+         */
+        asm("v3 = vmem(%0 + #0)\n\t"
+            "r1 = #1\n\t"
+            "v12 = vsplat(r1)\n\t"
+            "r1 = #2\n\t"
+            "v13 = vsplat(r1)\n\t"
+            "{\n\t"
+            "    v12.tmp = v13\n\t"
+            "    v4.w = vadd(v12.w, v3.w)\n\t"
+            "}\n\t"
+            "v4.w = vadd(v4.w, v12.w)\n\t"
+            "vmem(%1 + #0) = v4\n\t"
+            : : "r"(p0), "r"(pout)
+            : "r1", "v3", "v4", "v12", "v13", "memory");
+        p0 += sizeof(MMVector);
+        pout += sizeof(MMVector);
+
+        for (int j = 0; j < MAX_VEC_SIZE_BYTES / 4; j++) {
+            expect[i].w[j] = buffer0[i].w[j] + 3;
+        }
+    }
+
+    check_output_w(__LINE__, BUFSIZE);
+}
+
+static void test_vcombine_tmp(void)
+{
+    void *p0 = buffer0;
+    void *p1 = buffer1;
+    void *pout = output;
+
+    memset(expect, 0xaa, sizeof(expect));
+    memset(output, 0xbb, sizeof(output));
+
+    for (int i = 0; i < BUFSIZE; i++) {
+        /*
+         * Combine into v13:12 as .tmp, then use it in the next packet
+         * Should get the new value within the same packet and
+         * the old value in the next packet
+         */
+        asm("v3 = vmem(%0 + #0)\n\t"
+            "r1 = #1\n\t"
+            "v12 = vsplat(r1)\n\t"
+            "r1 = #2\n\t"
+            "v13 = vsplat(r1)\n\t"
+            "r1 = #3\n\t"
+            "v14 = vsplat(r1)\n\t"
+            "r1 = #4\n\t"
+            "v15 = vsplat(r1)\n\t"
+            "{\n\t"
+            "    v13:12.tmp = vcombine(v15, v14)\n\t"
+            "    v4.w = vadd(v12.w, v3.w)\n\t"
+            "    v16 = v13\n\t"
+            "}\n\t"
+            "v4.w = vadd(v4.w, v12.w)\n\t"
+            "v4.w = vadd(v4.w, v13.w)\n\t"
+            "v4.w = vadd(v4.w, v16.w)\n\t"
+            "vmem(%2 + #0) = v4\n\t"
+            : : "r"(p0), "r"(p1), "r"(pout)
+            : "r1", "v3", "v4", "v12", "v13", "v14", "v15", "v16", "memory");
+        p0 += sizeof(MMVector);
+        p1 += sizeof(MMVector);
+        pout += sizeof(MMVector);
+
+        for (int j = 0; j < MAX_VEC_SIZE_BYTES / 4; j++) {
+            expect[i].w[j] = buffer0[i].w[j] + 10;
+        }
+    }
+
+    check_output_w(__LINE__, BUFSIZE);
+}
+
+static void test_vmpyuhvs(void)
+{
+    void *p0 = buffer0;
+    void *p1 = buffer1;
+    void *pout = output;
+
+    memset(expect, 0xaa, sizeof(expect));
+    memset(output, 0xbb, sizeof(output));
+
+    for (int i = 0; i < BUFSIZE; i++) {
+        asm("v4 = vmem(%0 + #0)\n\t"
+            "v5 = vmem(%1 + #0)\n\t"
+            "v4.uh = vmpy(V4.uh, v5.uh):>>16\n\t"
+            "vmem(%2) = v4\n\t"
+            : : "r"(p0), "r"(p1), "r"(pout)
+            : "v4", "v5", "memory");
+        p0 += sizeof(MMVector);
+        p1 += sizeof(MMVector);
+        pout += sizeof(MMVector);
+
+        for (int j = 0; j < MAX_VEC_SIZE_BYTES / 2; j++) {
+            expect[i].uh[j] = (buffer0[i].uh[j] * buffer1[i].uh[j]) >> 16;
+        }
+    }
+
+    check_output_h(__LINE__, BUFSIZE);
+}
+
+int main()
+{
+    init_buffers();
+
+    test_vasrvuhubrndsat();
+    test_vasrvuhubsat();
+    test_vasrvwuhrndsat();
+    test_vasrvwuhsat();
+
+    test_vassign_tmp();
+    test_vcombine_tmp();
+
+    test_vmpyuhvs();
+
+    puts(err ? "FAIL" : "PASS");
+    return err ? 1 : 0;
+}
diff --git a/tests/tcg/hexagon/Makefile.target b/tests/tcg/hexagon/Makefile.target
index cd834f35c5..8cd95cb4a7 100644
--- a/tests/tcg/hexagon/Makefile.target
+++ b/tests/tcg/hexagon/Makefile.target
@@ -78,6 +78,7 @@ HEX_TESTS += test_vspliceb
 
 HEX_TESTS += v68_scalar
 HEX_TESTS += v68_hvx
+HEX_TESTS += v69_hvx
 
 TESTS += $(HEX_TESTS)
 
@@ -95,6 +96,8 @@ hvx_misc: CFLAGS += -mhvx
 hvx_histogram: CFLAGS += -mhvx -Wno-gnu-folding-constant
 v68_hvx: v68_hvx.c hvx_misc.h v6mpy_ref.h
 v68_hvx: CFLAGS += -mhvx -Wno-unused-function
+v69_hvx: v69_hvx.c hvx_misc.h
+v69_hvx: CFLAGS += -mhvx -Wno-unused-function
 
 hvx_histogram: hvx_histogram.c hvx_histogram_row.S
 	$(CC) $(CFLAGS) $(CROSS_CC_GUEST_CFLAGS) $^ -o $@ $(LDFLAGS)
-- 
2.25.1



* [PATCH 8/9] Hexagon (target/hexagon) Add v73 scalar instructions
  2023-04-26  2:30 [PATCH 0/9] Hexagon (target/hexagon) New architecture support Taylor Simpson
                   ` (6 preceding siblings ...)
  2023-04-26  2:30 ` [PATCH 7/9] Hexagon (tests/tcg/hexagon) Add v69 HVX tests Taylor Simpson
@ 2023-04-26  2:30 ` Taylor Simpson
  2023-04-27 14:48   ` Anton Johansson via
  2023-04-26  2:30 ` [PATCH 9/9] Hexagon (tests/tcg/hexagon) Add v73 scalar tests Taylor Simpson
  8 siblings, 1 reply; 20+ messages in thread
From: Taylor Simpson @ 2023-04-26  2:30 UTC (permalink / raw
  To: qemu-devel
  Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
	quic_mathbern

The following instructions are added
    J2_callrh
    J2_jumprh

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/gen_tcg.h              | 4 ++++
 target/hexagon/attribs_def.h.inc      | 1 +
 target/hexagon/imported/branch.idef   | 7 ++++++-
 target/hexagon/imported/encode_pp.def | 2 ++
 4 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index 598d80d3ce..6f12f665db 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -653,6 +653,8 @@
     gen_call(ctx, riV)
 #define fGEN_TCG_J2_callr(SHORTCODE) \
     gen_callr(ctx, RsV)
+#define fGEN_TCG_J2_callrh(SHORTCODE) \
+    gen_callr(ctx, RsV)
 
 #define fGEN_TCG_J2_callt(SHORTCODE) \
     gen_cond_call(ctx, PuV, TCG_COND_EQ, riV)
@@ -851,6 +853,8 @@
     gen_jump(ctx, riV)
 #define fGEN_TCG_J2_jumpr(SHORTCODE) \
     gen_jumpr(ctx, RsV)
+#define fGEN_TCG_J2_jumprh(SHORTCODE) \
+    gen_jumpr(ctx, RsV)
 #define fGEN_TCG_J4_jumpseti(SHORTCODE) \
     do { \
         tcg_gen_movi_tl(RdV, UiV); \
diff --git a/target/hexagon/attribs_def.h.inc b/target/hexagon/attribs_def.h.inc
index 3bef60bef3..69da9776f0 100644
--- a/target/hexagon/attribs_def.h.inc
+++ b/target/hexagon/attribs_def.h.inc
@@ -89,6 +89,7 @@ DEF_ATTRIB(JUMP, "Jump-type instruction", "", "")
 DEF_ATTRIB(INDIRECT, "Absolute register jump", "", "")
 DEF_ATTRIB(CALL, "Function call instruction", "", "")
 DEF_ATTRIB(COF, "Change-of-flow instruction", "", "")
+DEF_ATTRIB(HINTED_COF, "This instruction is a hinted change-of-flow", "", "")
 DEF_ATTRIB(CONDEXEC, "May be cancelled by a predicate", "", "")
 DEF_ATTRIB(DOTNEWVALUE, "Uses a register value generated in this pkt", "", "")
 DEF_ATTRIB(NEWCMPJUMP, "Compound compare and jump", "", "")
diff --git a/target/hexagon/imported/branch.idef b/target/hexagon/imported/branch.idef
index 88f5f48cce..93e2e375a5 100644
--- a/target/hexagon/imported/branch.idef
+++ b/target/hexagon/imported/branch.idef
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *  Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -34,6 +34,9 @@ Q6INSN(J2_jump,"jump #r22:2",ATTRIBS(A_JDIR), "direct unconditional jump",
 Q6INSN(J2_jumpr,"jumpr Rs32",ATTRIBS(A_JINDIR), "indirect unconditional jump",
 {fJUMPR(RsN,RsV,COF_TYPE_JUMPR);})
 
+Q6INSN(J2_jumprh,"jumprh Rs32",ATTRIBS(A_JINDIR, A_HINTED_COF), "indirect unconditional jump",
+{fJUMPR(RsN,RsV,COF_TYPE_JUMPR);})
+
 #define OLDCOND_JUMP(TAG,OPER,OPER2,ATTRIB,DESCR,SEMANTICS) \
 Q6INSN(TAG##t,"if (Pu4) "OPER":nt "OPER2,ATTRIB,DESCR,{fBRANCH_SPECULATE_STALL(fLSBOLD(PuV),,SPECULATE_NOT_TAKEN,12,0); if (fLSBOLD(PuV)) { SEMANTICS; }}) \
 Q6INSN(TAG##f,"if (!Pu4) "OPER":nt "OPER2,ATTRIB,DESCR,{fBRANCH_SPECULATE_STALL(fLSBOLDNOT(PuV),,SPECULATE_NOT_TAKEN,12,0); if (fLSBOLDNOT(PuV)) { SEMANTICS; }}) \
@@ -196,6 +199,8 @@ Q6INSN(J2_callrt,"if (Pu4) callr Rs32",ATTRIBS(CINDIR_STD),"indirect conditional
 Q6INSN(J2_callrf,"if (!Pu4) callr Rs32",ATTRIBS(CINDIR_STD),"indirect conditional call if false",
 {fBRANCH_SPECULATE_STALL(fLSBOLDNOT(PuV),,SPECULATE_NOT_TAKEN,12,0);if (fLSBOLDNOT(PuV)) { fCALLR(RsV); }})
 
+Q6INSN(J2_callrh,"callrh Rs32",ATTRIBS(CINDIR_STD, A_HINTED_COF), "hinted indirect unconditional call",
+{ fCALLR(RsV); })
 
 
 
diff --git a/target/hexagon/imported/encode_pp.def b/target/hexagon/imported/encode_pp.def
index 763f465bfd..0cd30a5e85 100644
--- a/target/hexagon/imported/encode_pp.def
+++ b/target/hexagon/imported/encode_pp.def
@@ -524,6 +524,7 @@ DEF_FIELD32(ICLASS_J" 110- -------- PP-!---- --------",J_PT,"Predict-taken")
 
 DEF_FIELDROW_DESC32(ICLASS_J" 0000 -------- PP------ --------","[#0] PC=(Rs), R31=return")
 DEF_ENC32(J2_callr,     ICLASS_J" 0000  101sssss  PP------  --------")
+DEF_ENC32(J2_callrh,    ICLASS_J" 0000  110sssss  PP------  --------")
 
 DEF_FIELDROW_DESC32(ICLASS_J" 0001 -------- PP------ --------","[#1] if (Pu) PC=(Rs), R31=return")
 DEF_ENC32(J2_callrt,    ICLASS_J" 0001  000sssss  PP----uu  --------")
@@ -531,6 +532,7 @@ DEF_ENC32(J2_callrf,    ICLASS_J" 0001  001sssss  PP----uu  --------")
 
 DEF_FIELDROW_DESC32(ICLASS_J" 0010 -------- PP------ --------","[#2] PC=(Rs); ")
 DEF_ENC32(J2_jumpr,      ICLASS_J" 0010  100sssss  PP------  --------")
+DEF_ENC32(J2_jumprh,     ICLASS_J" 0010  110sssss  PP------  --------")
 DEF_ENC32(J4_hintjumpr,  ICLASS_J" 0010  101sssss  PP------  --------")
 
 DEF_FIELDROW_DESC32(ICLASS_J" 0011 -------- PP------ --------","[#3] if (Pu) PC=(Rs) ")
-- 
2.25.1



* [PATCH 9/9] Hexagon (tests/tcg/hexagon) Add v73 scalar tests
  2023-04-26  2:30 [PATCH 0/9] Hexagon (target/hexagon) New architecture support Taylor Simpson
                   ` (7 preceding siblings ...)
  2023-04-26  2:30 ` [PATCH 8/9] Hexagon (target/hexagon) Add v73 scalar instructions Taylor Simpson
@ 2023-04-26  2:30 ` Taylor Simpson
  2023-04-27 15:02   ` Anton Johansson via
  8 siblings, 1 reply; 20+ messages in thread
From: Taylor Simpson @ 2023-04-26  2:30 UTC (permalink / raw
  To: qemu-devel
  Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
	quic_mathbern

Tests added for the following instructions
    J2_callrh
    J2_jumprh

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 tests/tcg/hexagon/v73_scalar.c    | 96 +++++++++++++++++++++++++++++++
 tests/tcg/hexagon/Makefile.target |  2 +
 2 files changed, 98 insertions(+)
 create mode 100644 tests/tcg/hexagon/v73_scalar.c

diff --git a/tests/tcg/hexagon/v73_scalar.c b/tests/tcg/hexagon/v73_scalar.c
new file mode 100644
index 0000000000..fee67fc531
--- /dev/null
+++ b/tests/tcg/hexagon/v73_scalar.c
@@ -0,0 +1,96 @@
+/*
+ *  Copyright(c) 2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <stdio.h>
+#include <stdbool.h>
+#include <stdint.h>
+
+/*
+ *  Test the scalar core instructions that are new in v73
+ */
+
+int err;
+
+static void __check32(int line, uint32_t result, uint32_t expect)
+{
+    if (result != expect) {
+        printf("ERROR at line %d: 0x%08x != 0x%08x\n",
+               line, result, expect);
+        err++;
+    }
+}
+
+#define check32(RES, EXP) __check32(__LINE__, RES, EXP)
+
+static void __check64(int line, uint64_t result, uint64_t expect)
+{
+    if (result != expect) {
+        printf("ERROR at line %d: 0x%016llx != 0x%016llx\n",
+               line, result, expect);
+        err++;
+    }
+}
+
+#define check64(RES, EXP) __check64(__LINE__, RES, EXP)
+
+static bool my_func_called;
+
+static void my_func(void)
+{
+    my_func_called = true;
+}
+
+static inline void callrh(void *func)
+{
+    asm volatile("callrh %0\n\t"
+                 : : "r"(func)
+                 /* Mark the caller-save registers as clobbered */
+                 : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7", "r8", "r9",
+                   "r10", "r11", "r12", "r13", "r14", "r15", "r28",
+                   "p0", "p1", "p2", "p3");
+}
+
+static void test_callrh(void)
+{
+    my_func_called = false;
+    callrh(&my_func);
+    check32(my_func_called, true);
+}
+
+static void test_jumprh(void)
+{
+    uint32_t res;
+    asm ("%0 = #5\n\t"
+         "r0 = ##1f\n\t"
+         "jumprh r0\n\t"
+         "%0 = #3\n\t"
+         "jump 2f\n\t"
+         "1:\n\t"
+         "%0 = #1\n\t"
+         "2:\n\t"
+         : "=r"(res) : : "r0");
+    check32(res, 1);
+}
+
+int main()
+{
+    test_callrh();
+    test_jumprh();
+
+    puts(err ? "FAIL" : "PASS");
+    return err ? 1 : 0;
+}
diff --git a/tests/tcg/hexagon/Makefile.target b/tests/tcg/hexagon/Makefile.target
index 8cd95cb4a7..28ef509689 100644
--- a/tests/tcg/hexagon/Makefile.target
+++ b/tests/tcg/hexagon/Makefile.target
@@ -79,6 +79,7 @@ HEX_TESTS += test_vspliceb
 HEX_TESTS += v68_scalar
 HEX_TESTS += v68_hvx
 HEX_TESTS += v69_hvx
+HEX_TESTS += v73_scalar
 
 TESTS += $(HEX_TESTS)
 
@@ -98,6 +99,7 @@ v68_hvx: v68_hvx.c hvx_misc.h v6mpy_ref.h
 v68_hvx: CFLAGS += -mhvx -Wno-unused-function
 v69_hvx: v69_hvx.c hvx_misc.h
 v69_hvx: CFLAGS += -mhvx -Wno-unused-function
+v73_scalar: CFLAGS += -Wno-unused-function
 
 hvx_histogram: hvx_histogram.c hvx_histogram_row.S
 	$(CC) $(CFLAGS) $(CROSS_CC_GUEST_CFLAGS) $^ -o $@ $(LDFLAGS)
-- 
2.25.1



* Re: [PATCH 1/9] Hexagon (target/hexagon) Add support for v68/v69/v71/v73
  2023-04-26  2:30 ` [PATCH 1/9] Hexagon (target/hexagon) Add support for v68/v69/v71/v73 Taylor Simpson
@ 2023-04-26 18:06   ` Anton Johansson via
  2023-04-26 20:27     ` Taylor Simpson
  0 siblings, 1 reply; 20+ messages in thread
From: Anton Johansson via @ 2023-04-26 18:06 UTC (permalink / raw
  To: Taylor Simpson, qemu-devel
  Cc: richard.henderson, philmd, ale, bcain, quic_mathbern


On 4/26/23 04:30, Taylor Simpson wrote:
> Add support for the ELF flags
> Move target/hexagon/cpu.[ch] to be v73
> Change the compiler flag used by "make check-tcg"
>
> The decbin instruction is removed in Hexagon v73, so check the
> version before trying to compile the instruction.
>
> Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
> ---
>   configure                         |  2 +-
>   linux-user/hexagon/target_elf.h   | 13 +++++++++----
>   target/hexagon/cpu.h              |  4 ++++
>   target/hexagon/cpu.c              | 20 ++++++++++++++++++++
>   tests/tcg/hexagon/misc.c          | 12 ++++++++++++
>   tests/tcg/hexagon/Makefile.target |  3 +++
>   6 files changed, 49 insertions(+), 5 deletions(-)
>
> diff --git a/configure b/configure
> index 77c03315f8..01fa77f6c7 100755
> --- a/configure
> +++ b/configure
> @@ -1857,7 +1857,7 @@ fi
>   : ${cross_cc_armeb="$cross_cc_arm"}
>   : ${cross_cc_cflags_armeb="-mbig-endian"}
>   : ${cross_cc_hexagon="hexagon-unknown-linux-musl-clang"}
> -: ${cross_cc_cflags_hexagon="-mv67 -O2 -static"}
> +: ${cross_cc_cflags_hexagon="-mv73 -O2 -static"}
>   : ${cross_cc_cflags_i386="-m32"}
>   : ${cross_cc_cflags_ppc="-m32 -mbig-endian"}
>   : ${cross_cc_cflags_ppc64="-m64 -mbig-endian"}
> diff --git a/linux-user/hexagon/target_elf.h b/linux-user/hexagon/target_elf.h
> index b4e9f40527..a0271a0a2a 100644
> --- a/linux-user/hexagon/target_elf.h
> +++ b/linux-user/hexagon/target_elf.h
> @@ -1,5 +1,5 @@
>   /*
> - *  Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
> + *  Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
>    *
>    *  This program is free software; you can redistribute it and/or modify
>    *  it under the terms of the GNU General Public License as published by
> @@ -20,7 +20,7 @@
>   
>   static inline const char *cpu_get_model(uint32_t eflags)
>   {
> -    /* For now, treat anything newer than v5 as a v67 */
> +    /* For now, treat anything newer than v5 as a v73 */
>       /* FIXME - Disable instructions that are newer than the specified arch */
>       if (eflags == 0x04 ||    /* v5  */
>           eflags == 0x05 ||    /* v55 */
> @@ -30,9 +30,14 @@ static inline const char *cpu_get_model(uint32_t eflags)
>           eflags == 0x65 ||    /* v65 */
>           eflags == 0x66 ||    /* v66 */
>           eflags == 0x67 ||    /* v67 */
> -        eflags == 0x8067     /* v67t */
> +        eflags == 0x8067 ||  /* v67t */
> +        eflags == 0x68 ||    /* v68 */
> +        eflags == 0x69 ||    /* v69 */
> +        eflags == 0x71 ||    /* v71 */
> +        eflags == 0x8071 ||  /* v71t */
> +        eflags == 0x73       /* v73 */
>          ) {
> -        return "v67";
> +        return "v73";
>       }
>       return "unknown";
>   }
> diff --git a/target/hexagon/cpu.h b/target/hexagon/cpu.h
> index 81b663ecfb..4d8981d862 100644
> --- a/target/hexagon/cpu.h
> +++ b/target/hexagon/cpu.h
> @@ -43,6 +43,10 @@
>   #define CPU_RESOLVING_TYPE TYPE_HEXAGON_CPU
>   
>   #define TYPE_HEXAGON_CPU_V67 HEXAGON_CPU_TYPE_NAME("v67")
> +#define TYPE_HEXAGON_CPU_V68 HEXAGON_CPU_TYPE_NAME("v68")
> +#define TYPE_HEXAGON_CPU_V69 HEXAGON_CPU_TYPE_NAME("v69")
> +#define TYPE_HEXAGON_CPU_V71 HEXAGON_CPU_TYPE_NAME("v71")
> +#define TYPE_HEXAGON_CPU_V73 HEXAGON_CPU_TYPE_NAME("v73")
>   
>   #define MMU_USER_IDX 0
>   
> diff --git a/target/hexagon/cpu.c b/target/hexagon/cpu.c
> index ab40cfc283..8699db8c24 100644
> --- a/target/hexagon/cpu.c
> +++ b/target/hexagon/cpu.c
> @@ -29,6 +29,22 @@ static void hexagon_v67_cpu_init(Object *obj)
>   {
>   }
>   
> +static void hexagon_v68_cpu_init(Object *obj)
> +{
> +}
> +
> +static void hexagon_v69_cpu_init(Object *obj)
> +{
> +}
> +
> +static void hexagon_v71_cpu_init(Object *obj)
> +{
> +}
> +
> +static void hexagon_v73_cpu_init(Object *obj)
> +{
> +}
> +
>   static ObjectClass *hexagon_cpu_class_by_name(const char *cpu_model)
>   {
>       ObjectClass *oc;
> @@ -382,6 +398,10 @@ static const TypeInfo hexagon_cpu_type_infos[] = {
>           .class_init = hexagon_cpu_class_init,
>       },
>       DEFINE_CPU(TYPE_HEXAGON_CPU_V67,              hexagon_v67_cpu_init),
> +    DEFINE_CPU(TYPE_HEXAGON_CPU_V68,              hexagon_v68_cpu_init),
> +    DEFINE_CPU(TYPE_HEXAGON_CPU_V69,              hexagon_v69_cpu_init),
> +    DEFINE_CPU(TYPE_HEXAGON_CPU_V71,              hexagon_v71_cpu_init),
> +    DEFINE_CPU(TYPE_HEXAGON_CPU_V73,              hexagon_v73_cpu_init),

The large spacing to hexagon_v*_cpu_init looks a bit odd.

Also, do we need to provide a *_cpu_init() stub for each version? Seems
from qom/object.c like we should be able to just default initialize it.

Otherwise,

Reviewed-by: Anton Johansson <anjo@rev.ng>




* RE: [PATCH 1/9] Hexagon (target/hexagon) Add support for v68/v69/v71/v73
  2023-04-26 18:06   ` Anton Johansson via
@ 2023-04-26 20:27     ` Taylor Simpson
  0 siblings, 0 replies; 20+ messages in thread
From: Taylor Simpson @ 2023-04-26 20:27 UTC (permalink / raw
  To: anjo@rev.ng, qemu-devel@nongnu.org
  Cc: richard.henderson@linaro.org, philmd@linaro.org, ale@rev.ng,
	Brian Cain, Matheus Bernardino (QUIC)



> -----Original Message-----
> From: Anton Johansson <anjo@rev.ng>
> Sent: Wednesday, April 26, 2023 1:06 PM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: richard.henderson@linaro.org; philmd@linaro.org; ale@rev.ng; Brian Cain
> <bcain@quicinc.com>; Matheus Bernardino (QUIC)
> <quic_mathbern@quicinc.com>
> Subject: Re: [PATCH 1/9] Hexagon (target/hexagon) Add support for
> v68/v69/v71/v73
> 
> On 4/26/23 04:30, Taylor Simpson wrote:
> > diff --git a/target/hexagon/cpu.c b/target/hexagon/cpu.c index
> > ab40cfc283..8699db8c24 100644
> > --- a/target/hexagon/cpu.c
> > +++ b/target/hexagon/cpu.c
> > @@ -29,6 +29,22 @@ static void hexagon_v67_cpu_init(Object *obj)
> >   {
> >   }
> >
> > +static void hexagon_v68_cpu_init(Object *obj) { }
> > +
> > +static void hexagon_v69_cpu_init(Object *obj) { }
> > +
> > +static void hexagon_v71_cpu_init(Object *obj) { }
> > +
> > +static void hexagon_v73_cpu_init(Object *obj) { }
> > +
> >   static ObjectClass *hexagon_cpu_class_by_name(const char
> *cpu_model)
> >   {
> >       ObjectClass *oc;
> > @@ -382,6 +398,10 @@ static const TypeInfo hexagon_cpu_type_infos[] =
> {
> >           .class_init = hexagon_cpu_class_init,
> >       },
> >       DEFINE_CPU(TYPE_HEXAGON_CPU_V67,
> hexagon_v67_cpu_init),
> > +    DEFINE_CPU(TYPE_HEXAGON_CPU_V68,
> hexagon_v68_cpu_init),
> > +    DEFINE_CPU(TYPE_HEXAGON_CPU_V69,
> hexagon_v69_cpu_init),
> > +    DEFINE_CPU(TYPE_HEXAGON_CPU_V71,
> hexagon_v71_cpu_init),
> > +    DEFINE_CPU(TYPE_HEXAGON_CPU_V73,
> hexagon_v73_cpu_init),
> 
> The large spacing to hexagon_v*_cpu_init looks a bit odd.

I'll put each of them on a single line with no blank line in between.

> 
> Also, do we need to provide a *_cpu_init() stub for each version? Seems
> from qom/object.c like we should be able to just default initialize it

I could point them all to a single function, but at some point we'll want to execute only the instructions that are available on the specified version of the core.
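
For illustration only, here is a minimal standalone sketch of the behaviour being discussed. It is not QEMU code: ToyTypeInfo, toy_cpu_types, and toy_object_init are hypothetical stand-ins that mimic the way qom/object.c calls instance_init only when the hook is set, so an entry without a stub remains valid, while a per-version hook can still be added later.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for QOM's TypeInfo; not the real QEMU struct. */
typedef struct ToyTypeInfo {
    const char *name;
    void (*instance_init)(void *obj);   /* optional hook, may be NULL */
} ToyTypeInfo;

/* Counts how many times an init hook actually ran. */
static int init_calls;

static void toy_v67_cpu_init(void *obj)
{
    (void)obj;
    init_calls++;
}

/* Entries in the spirit of DEFINE_CPU(); the v68 entry has no stub. */
static const ToyTypeInfo toy_cpu_types[] = {
    { .name = "v67", .instance_init = toy_v67_cpu_init },
    { .name = "v68", .instance_init = NULL },
};

/* Find a type by name and run its init hook only if one is set. */
static int toy_object_init(const char *name, void *obj)
{
    size_t n = sizeof(toy_cpu_types) / sizeof(toy_cpu_types[0]);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(toy_cpu_types[i].name, name) == 0) {
            if (toy_cpu_types[i].instance_init) {
                toy_cpu_types[i].instance_init(obj);
            }
            return 0;
        }
    }
    return -1;   /* unknown type name */
}
```

Calling toy_object_init("v68", NULL) succeeds without running any hook, which is the "default initialize" case. Keeping one hook per version costs a few empty functions now but leaves a place to restrict the available instructions later.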

> 
> Otherwise,
> 
> Reviewed-by: Anton Johansson <anjo@rev.ng>



* Re: [PATCH 2/9] Hexagon (target/hexagon) Add v68 scalar instructions
  2023-04-26  2:30 ` [PATCH 2/9] Hexagon (target/hexagon) Add v68 scalar instructions Taylor Simpson
@ 2023-04-27 10:36   ` Anton Johansson via
  0 siblings, 0 replies; 20+ messages in thread
From: Anton Johansson via @ 2023-04-27 10:36 UTC (permalink / raw
  To: Taylor Simpson, qemu-devel
  Cc: richard.henderson, philmd, ale, bcain, quic_mathbern


On 4/26/23 04:30, Taylor Simpson wrote:
> The following instructions are added
>      L2_loadw_aq
>      L4_loadd_aq
>      R6_release_at_vi
>      R6_release_st_vi
>      S2_storew_rl_at_vi
>      S4_stored_rl_at_vi
>      S2_storew_rl_st_vi
>      S4_stored_rl_st_vi
>
> The release instructions are nop's in qemu.  The others behave as
>   loads/stores.
>
> The encodings for these instructions changed some "don't care" bits
>      L2_loadw_locked
>      L4_loadd_locked
>      S2_storew_locked
>      S4_stored_locked
>
> Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
> ---
>   target/hexagon/gen_tcg.h                | 18 ++++++++++++++++++
>   target/hexagon/attribs_def.h.inc        |  7 +++++++
>   target/hexagon/translate.c              |  3 +++
>   target/hexagon/gen_idef_parser_funcs.py |  2 ++
>   target/hexagon/imported/encode_pp.def   | 19 ++++++++++++++-----
>   target/hexagon/imported/ldst.idef       | 20 +++++++++++++++++++-
>   6 files changed, 63 insertions(+), 6 deletions(-)

Reviewed-by: Anton Johansson <anjo@rev.ng>




* Re: [PATCH 3/9] Hexagon (tests/tcg/hexagon) Add v68 scalar tests
  2023-04-26  2:30 ` [PATCH 3/9] Hexagon (tests/tcg/hexagon) Add v68 scalar tests Taylor Simpson
@ 2023-04-27 13:34   ` Anton Johansson via
  0 siblings, 0 replies; 20+ messages in thread
From: Anton Johansson via @ 2023-04-27 13:34 UTC (permalink / raw
  To: Taylor Simpson, qemu-devel
  Cc: richard.henderson, philmd, ale, bcain, quic_mathbern


On 4/26/23 04:30, Taylor Simpson wrote:
> Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
> ---
>   tests/tcg/hexagon/v68_scalar.c    | 186 ++++++++++++++++++++++++++++++
>   tests/tcg/hexagon/Makefile.target |   2 +
>   2 files changed, 188 insertions(+)
>   create mode 100644 tests/tcg/hexagon/v68_scalar.c
>
> diff --git a/tests/tcg/hexagon/v68_scalar.c b/tests/tcg/hexagon/v68_scalar.c
> new file mode 100644
> index 0000000000..7a8adb1130
> --- /dev/null
> +++ b/tests/tcg/hexagon/v68_scalar.c
> @@ -0,0 +1,186 @@
> +/*
> + *  Copyright(c) 2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
> + *
> + *  This program is free software; you can redistribute it and/or modify
> + *  it under the terms of the GNU General Public License as published by
> + *  the Free Software Foundation; either version 2 of the License, or
> + *  (at your option) any later version.
> + *
> + *  This program is distributed in the hope that it will be useful,
> + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
> + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + *  GNU General Public License for more details.
> + *
> + *  You should have received a copy of the GNU General Public License
> + *  along with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <stdio.h>
> +#include <stdbool.h>
> +#include <stdint.h>
> +
> +/*
> + *  Test the scalar core instructions that are new in v68
> + */
> +
> +int err;
> +
> +static int buffer32[] = { 1, 2, 3, 4 };
> +static long long buffer64[] = { 5, 6, 7, 8 };
> +
> +static void __check32(int line, uint32_t result, uint32_t expect)
> +{
> +    if (result != expect) {
> +        printf("ERROR at line %d: 0x%08x != 0x%08x\n",
> +               line, result, expect);
> +        err++;
> +    }
> +}
> +
> +#define check32(RES, EXP) __check32(__LINE__, RES, EXP)
> +
> +static void __check64(int line, uint64_t result, uint64_t expect)
> +{
> +    if (result != expect) {
> +        printf("ERROR at line %d: 0x%016llx != 0x%016llx\n",
> +               line, result, expect);
> +        err++;
> +    }
> +}
> +
> +#define check64(RES, EXP) __check64(__LINE__, RES, EXP)

check32/check64 show up in fpstuff.c, usr.c, mem_noshuf.c, and now in
v[68|73]_scalar.c, but in slight variations (different argument names
and argument order).  We should consider keeping a single definition
in a check_result.h header, or similar.
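
As a rough sketch of what such a shared header could contain (the check32_at/check64_at helper names and the file-local err counter are assumptions for illustration, not taken from the existing tests):

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Sketch of a possible shared header; all names here are illustrative. */
static int err;   /* running failure count, checked by the test's main() */

/* Report a 32-bit mismatch together with the caller's line number. */
static inline void check32_at(int line, uint32_t result, uint32_t expect)
{
    if (result != expect) {
        printf("ERROR at line %d: 0x%08" PRIx32 " != 0x%08" PRIx32 "\n",
               line, result, expect);
        err++;
    }
}

/* Same check for 64-bit values. */
static inline void check64_at(int line, uint64_t result, uint64_t expect)
{
    if (result != expect) {
        printf("ERROR at line %d: 0x%016" PRIx64 " != 0x%016" PRIx64 "\n",
               line, result, expect);
        err++;
    }
}

/* One argument order for every test: result first, expected value second. */
#define check32(RES, EXP) check32_at(__LINE__, (RES), (EXP))
#define check64(RES, EXP) check64_at(__LINE__, (RES), (EXP))
```

The PRIx32/PRIx64 format macros are used here so the format strings match the fixed-width argument types on any host; the existing tests use %x and %llx directly.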

Otherwise,

Reviewed-by: Anton Johansson <anjo@rev.ng>




* Re: [PATCH 4/9] Hexagon (target/hexagon) Add v68 HVX instructions
  2023-04-26  2:30 ` [PATCH 4/9] Hexagon (target/hexagon) Add v68 HVX instructions Taylor Simpson
@ 2023-04-27 13:36   ` Anton Johansson via
  0 siblings, 0 replies; 20+ messages in thread
From: Anton Johansson via @ 2023-04-27 13:36 UTC (permalink / raw
  To: Taylor Simpson, qemu-devel
  Cc: richard.henderson, philmd, ale, bcain, quic_mathbern


On 4/26/23 04:30, Taylor Simpson wrote:
> The following instructions are added
>      V6_v6mpyvubs10_vxx
>      V6_v6mpyhubs10_vxx
>      V6_v6mpyvubs10
>      V6_v6mpyhubs10
>
> Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
> ---
>   target/hexagon/mmvec/macros.h                |   9 +-
>   target/hexagon/imported/mmvec/encode_ext.def |   8 +-
>   target/hexagon/imported/mmvec/ext.idef       | 281 ++++++++++++++++++-
>   3 files changed, 295 insertions(+), 3 deletions(-)

Reviewed-by: Anton Johansson <anjo@rev.ng>




* Re: [PATCH 5/9] Hexagon (tests/tcg/hexagon) Add v68 HVX tests
  2023-04-26  2:30 ` [PATCH 5/9] Hexagon (tests/tcg/hexagon) Add v68 HVX tests Taylor Simpson
@ 2023-04-27 13:43   ` Anton Johansson via
  0 siblings, 0 replies; 20+ messages in thread
From: Anton Johansson via @ 2023-04-27 13:43 UTC (permalink / raw
  To: Taylor Simpson, qemu-devel
  Cc: richard.henderson, philmd, ale, bcain, quic_mathbern


On 4/26/23 04:30, Taylor Simpson wrote:
> ---
>   tests/tcg/hexagon/v6mpy_ref.h     | 161 ++++++++++++++++++++++++++++++
>   tests/tcg/hexagon/v68_hvx.c       |  90 +++++++++++++++++
>   tests/tcg/hexagon/Makefile.target |   3 +
>   3 files changed, 254 insertions(+)
>   create mode 100644 tests/tcg/hexagon/v6mpy_ref.h
>   create mode 100644 tests/tcg/hexagon/v68_hvx.c
>
> diff --git a/tests/tcg/hexagon/v6mpy_ref.h b/tests/tcg/hexagon/v6mpy_ref.h
> new file mode 100644
> index 0000000000..8258cddcb1
> --- /dev/null
> +++ b/tests/tcg/hexagon/v6mpy_ref.h

Use the *.h.inc extension to match the rest of the codebase.

Otherwise,

Reviewed-by: Anton Johansson <anjo@rev.ng>





* Re: [PATCH 6/9] Hexagon (target/hexagon) Add v69 HVX instructions
  2023-04-26  2:30 ` [PATCH 6/9] Hexagon (target/hexagon) Add v69 HVX instructions Taylor Simpson
@ 2023-04-27 13:56   ` Anton Johansson via
  0 siblings, 0 replies; 20+ messages in thread
From: Anton Johansson via @ 2023-04-27 13:56 UTC (permalink / raw
  To: Taylor Simpson, qemu-devel
  Cc: richard.henderson, philmd, ale, bcain, quic_mathbern


On 4/26/23 04:30, Taylor Simpson wrote:
> The following instructions are added
>      V6_vasrvuhubrndsat
>      V6_vasrvuhubsat
>      V6_vasrvwuhrndsat
>      V6_vasrvwuhsat
>      V6_vassign_tmp
>      V6_vcombine_tmp
>      V6_vmpyuhvs
>
> Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
> ---
>   target/hexagon/gen_tcg_hvx.h                 | 12 ++++++
>   target/hexagon/attribs_def.h.inc             |  8 ++++
>   target/hexagon/imported/mmvec/encode_ext.def |  8 ++++
>   target/hexagon/imported/mmvec/ext.idef       | 40 ++++++++++++++++++++
>   4 files changed, 68 insertions(+)

Reviewed-by: Anton Johansson <anjo@rev.ng>




* Re: [PATCH 7/9] Hexagon (tests/tcg/hexagon) Add v69 HVX tests
  2023-04-26  2:30 ` [PATCH 7/9] Hexagon (tests/tcg/hexagon) Add v69 HVX tests Taylor Simpson
@ 2023-04-27 14:39   ` Anton Johansson via
  0 siblings, 0 replies; 20+ messages in thread
From: Anton Johansson via @ 2023-04-27 14:39 UTC (permalink / raw
  To: Taylor Simpson, qemu-devel
  Cc: richard.henderson, philmd, ale, bcain, quic_mathbern


On 4/26/23 04:30, Taylor Simpson wrote:
> The following instructions are tested
>      V6_vasrvuhubrndsat
>      V6_vasrvuhubsat
>      V6_vasrvwuhrndsat
>      V6_vasrvwuhsat
>      V6_vassign_tmp
>      V6_vcombine_tmp
>      V6_vmpyuhvs
>
> Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
> ---
>   tests/tcg/hexagon/v69_hvx.c       | 318 ++++++++++++++++++++++++++++++
>   tests/tcg/hexagon/Makefile.target |   3 +
>   2 files changed, 321 insertions(+)
>   create mode 100644 tests/tcg/hexagon/v69_hvx.c
>
> diff --git a/tests/tcg/hexagon/v69_hvx.c b/tests/tcg/hexagon/v69_hvx.c
> new file mode 100644
> index 0000000000..051e5420df
> --- /dev/null
> +++ b/tests/tcg/hexagon/v69_hvx.c
> @@ -0,0 +1,318 @@
> +/*
> + *  Copyright(c) 2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
> + *
> + *  This program is free software; you can redistribute it and/or modify
> + *  it under the terms of the GNU General Public License as published by
> + *  the Free Software Foundation; either version 2 of the License, or
> + *  (at your option) any later version.
> + *
> + *  This program is distributed in the hope that it will be useful,
> + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
> + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + *  GNU General Public License for more details.
> + *
> + *  You should have received a copy of the GNU General Public License
> + *  along with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <stdio.h>
> +#include <stdint.h>
> +#include <stdbool.h>
> +#include <string.h>
> +#include <limits.h>
> +
> +int err;
> +
> +#include "hvx_misc.h"
> +
> +#define fVROUND(VAL, SHAMT) \
> +    ((VAL) + (((SHAMT) > 0) ? (1LL << ((SHAMT) - 1)) : 0))
> +
> +#define fVSATUB(VAL) \
> +    ((((VAL) & 0xffLL) == (VAL)) ? \
> +        (VAL) : \
> +        ((((int32_t)(VAL)) < 0) ? 0 : 0xff))
> +
> +#define fVSATUH(VAL) \
> +    ((((VAL) & 0xffffLL) == (VAL)) ? \
> +        (VAL) : \
> +        ((((int32_t)(VAL)) < 0) ? 0 : 0xffff))
> +
> +static void test_vasrvuhubrndsat(void)
> +{
> +    void *p0 = buffer0;
> +    void *p1 = buffer1;
> +    void *pout = output;
> +
> +    memset(expect, 0xaa, sizeof(expect));
> +    memset(output, 0xbb, sizeof(output));
> +
> +    for (int i = 0; i < BUFSIZE / 2; i++) {
> +        asm("v4 = vmem(%0 + #0)\n\t"
> +            "v5 = vmem(%0 + #1)\n\t"
> +            "v6 = vmem(%1 + #0)\n\t"
> +            "v5.ub = vasr(v5:4.uh, v6.ub):rnd:sat\n\t"
> +            "vmem(%2) = v5\n\t"
> +            : : "r"(p0), "r"(p1), "r"(pout)
> +            : "v4", "v5", "v6", "memory");
> +        p0 += sizeof(MMVector) * 2;
> +        p1 += sizeof(MMVector);
> +        pout += sizeof(MMVector);
> +
> +        for (int j = 0; j < MAX_VEC_SIZE_BYTES / 2; j++) {
> +            int shamt;
> +            uint8_t byte0;
> +            uint8_t byte1;
> +
> +            shamt = buffer1[i].ub[2 * j + 0] & 0x7;
> +            byte0 = fVSATUB(fVROUND(buffer0[2 * i + 0].uh[j], shamt) >> shamt);
> +            shamt = buffer1[i].ub[2 * j + 1] & 0x7;
> +            byte1 = fVSATUB(fVROUND(buffer0[2 * i + 1].uh[j], shamt) >> shamt);
> +            expect[i].uh[j] = (byte1 << 8) | (byte0 & 0xff);
> +        }
> +    }
> +
> +    check_output_h(__LINE__, BUFSIZE / 2);
> +}
> +
> +static void test_vasrvuhubsat(void)
> +{
> +    void *p0 = buffer0;
> +    void *p1 = buffer1;
> +    void *pout = output;
> +
> +    memset(expect, 0xaa, sizeof(expect));
> +    memset(output, 0xbb, sizeof(output));
> +
> +    for (int i = 0; i < BUFSIZE / 2; i++) {
> +        asm("v4 = vmem(%0 + #0)\n\t"
> +            "v5 = vmem(%0 + #1)\n\t"
> +            "v6 = vmem(%1 + #0)\n\t"
> +            "v5.ub = vasr(v5:4.uh, v6.ub):sat\n\t"
> +            "vmem(%2) = v5\n\t"
> +            : : "r"(p0), "r"(p1), "r"(pout)
> +            : "v4", "v5", "v6", "memory");
> +        p0 += sizeof(MMVector) * 2;
> +        p1 += sizeof(MMVector);
> +        pout += sizeof(MMVector);
> +
> +        for (int j = 0; j < MAX_VEC_SIZE_BYTES / 2; j++) {
> +            int shamt;
> +            uint8_t byte0;
> +            uint8_t byte1;
> +
> +            shamt = buffer1[i].ub[2 * j + 0] & 0x7;
> +            byte0 = fVSATUB(buffer0[2 * i + 0].uh[j] >> shamt);
> +            shamt = buffer1[i].ub[2 * j + 1] & 0x7;
> +            byte1 = fVSATUB(buffer0[2 * i + 1].uh[j] >> shamt);
> +            expect[i].uh[j] = (byte1 << 8) | (byte0 & 0xff);
> +        }
> +    }
> +
> +    check_output_h(__LINE__, BUFSIZE / 2);
> +}
> +
> +static void test_vasrvwuhrndsat(void)
> +{
> +    void *p0 = buffer0;
> +    void *p1 = buffer1;
> +    void *pout = output;
> +
> +    memset(expect, 0xaa, sizeof(expect));
> +    memset(output, 0xbb, sizeof(output));
> +
> +    for (int i = 0; i < BUFSIZE / 2; i++) {
> +        asm("v4 = vmem(%0 + #0)\n\t"
> +            "v5 = vmem(%0 + #1)\n\t"
> +            "v6 = vmem(%1 + #0)\n\t"
> +            "v5.uh = vasr(v5:4.w, v6.uh):rnd:sat\n\t"
> +            "vmem(%2) = v5\n\t"
> +            : : "r"(p0), "r"(p1), "r"(pout)
> +            : "v4", "v5", "v6", "memory");
> +        p0 += sizeof(MMVector) * 2;
> +        p1 += sizeof(MMVector);
> +        pout += sizeof(MMVector);
> +
> +        for (int j = 0; j < MAX_VEC_SIZE_BYTES / 2; j++) {
> +            int shamt;
> +            uint16_t half0;
> +            uint16_t half1;
> +
> +            shamt = buffer1[i].uh[2 * j + 0] & 0xf;
> +            half0 = fVSATUH(fVROUND(buffer0[2 * i + 0].w[j], shamt) >> shamt);
> +            shamt = buffer1[i].uh[2 * j + 1] & 0xf;
> +            half1 = fVSATUH(fVROUND(buffer0[2 * i + 1].w[j], shamt) >> shamt);
> +            expect[i].w[j] = (half1 << 16) | (half0 & 0xffff);

I think we want MAX_VEC_SIZE_BYTES / 4 as the upper bound for this loop.
We currently overflow because we're indexing words (.w[j]), and a vector
holds only MAX_VEC_SIZE_BYTES / 4 of them.
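
To make the arithmetic concrete, here is a small standalone sketch. ToyVector, TOY_VEC_SIZE_BYTES, and toy_pack_words are hypothetical names; the 128-byte size is an assumption meant to mirror MAX_VEC_SIZE_BYTES, and the union only imitates the ub/uh/w views used by the test.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_VEC_SIZE_BYTES 128   /* assumed to mirror MAX_VEC_SIZE_BYTES */

/* Overlapping views of one vector; the element count shrinks as width grows. */
typedef union {
    uint8_t  ub[TOY_VEC_SIZE_BYTES / 1];   /* 128 bytes     */
    uint16_t uh[TOY_VEC_SIZE_BYTES / 2];   /*  64 halfwords */
    uint32_t uw[TOY_VEC_SIZE_BYTES / 4];   /*  32 words     */
} ToyVector;

#define TOY_WORDS_PER_VEC (TOY_VEC_SIZE_BYTES / sizeof(uint32_t))

/*
 * Pack two halfwords into every word lane of the vector.
 * The loop bound is the number of words; using the halfword count
 * (TOY_VEC_SIZE_BYTES / 2) would write past the end of uw[].
 */
static size_t toy_pack_words(ToyVector *out, uint16_t half0, uint16_t half1)
{
    size_t j;

    for (j = 0; j < TOY_WORDS_PER_VEC; j++) {
        out->uw[j] = ((uint32_t)half1 << 16) | half0;
    }
    return j;   /* number of word lanes written */
}
```

With these assumed sizes the function writes 32 words, whereas a bound of TOY_VEC_SIZE_BYTES / 2 would attempt 64.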

> +        }
> +    }
> +
> +    check_output_w(__LINE__, BUFSIZE / 2);
> +}
> +
> +static void test_vasrvwuhsat(void)
> +{
> +    void *p0 = buffer0;
> +    void *p1 = buffer1;
> +    void *pout = output;
> +
> +    memset(expect, 0xaa, sizeof(expect));
> +    memset(output, 0xbb, sizeof(output));
> +
> +    for (int i = 0; i < BUFSIZE / 2; i++) {
> +        asm("v4 = vmem(%0 + #0)\n\t"
> +            "v5 = vmem(%0 + #1)\n\t"
> +            "v6 = vmem(%1 + #0)\n\t"
> +            "v5.uh = vasr(v5:4.w, v6.uh):sat\n\t"
> +            "vmem(%2) = v5\n\t"
> +            : : "r"(p0), "r"(p1), "r"(pout)
> +            : "v4", "v5", "v6", "memory");
> +        p0 += sizeof(MMVector) * 2;
> +        p1 += sizeof(MMVector);
> +        pout += sizeof(MMVector);
> +
> +        for (int j = 0; j < MAX_VEC_SIZE_BYTES / 2; j++) {
> +            int shamt;
> +            uint16_t half0;
> +            uint16_t half1;
> +
> +            shamt = buffer1[i].uh[2 * j + 0] & 0xf;
> +            half0 = fVSATUH(buffer0[2 * i + 0].w[j] >> shamt);
> +            shamt = buffer1[i].uh[2 * j + 1] & 0xf;
> +            half1 = fVSATUH(buffer0[2 * i + 1].w[j] >> shamt);
> +            expect[i].w[j] = (half1 << 16) | (half0 & 0xffff);
Same here.

Otherwise,

Reviewed-by: Anton Johansson <anjo@rev.ng>




* Re: [PATCH 8/9] Hexagon (target/hexagon) Add v73 scalar instructions
  2023-04-26  2:30 ` [PATCH 8/9] Hexagon (target/hexagon) Add v73 scalar instructions Taylor Simpson
@ 2023-04-27 14:48   ` Anton Johansson via
  0 siblings, 0 replies; 20+ messages in thread
From: Anton Johansson via @ 2023-04-27 14:48 UTC (permalink / raw
  To: Taylor Simpson, qemu-devel
  Cc: richard.henderson, philmd, ale, bcain, quic_mathbern


On 4/26/23 04:30, Taylor Simpson wrote:
> The following instructions are added
>      J2_callrh
>      J2_jumprh
>
> Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
> ---
>   target/hexagon/gen_tcg.h              | 4 ++++
>   target/hexagon/attribs_def.h.inc      | 1 +
>   target/hexagon/imported/branch.idef   | 7 ++++++-
>   target/hexagon/imported/encode_pp.def | 2 ++
>   4 files changed, 13 insertions(+), 1 deletion(-)

Reviewed-by: Anton Johansson <anjo@rev.ng>




* Re: [PATCH 9/9] Hexagon (tests/tcg/hexagon) Add v73 scalar tests
  2023-04-26  2:30 ` [PATCH 9/9] Hexagon (tests/tcg/hexagon) Add v73 scalar tests Taylor Simpson
@ 2023-04-27 15:02   ` Anton Johansson via
  0 siblings, 0 replies; 20+ messages in thread
From: Anton Johansson via @ 2023-04-27 15:02 UTC (permalink / raw
  To: Taylor Simpson, qemu-devel
  Cc: richard.henderson, philmd, ale, bcain, quic_mathbern


On 4/26/23 04:30, Taylor Simpson wrote:
> Tests added for the following instructions
>      J2_callrh
>      J2_jumprh
>
> Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
> ---
>   tests/tcg/hexagon/v73_scalar.c    | 96 +++++++++++++++++++++++++++++++
>   tests/tcg/hexagon/Makefile.target |  2 +
>   2 files changed, 98 insertions(+)
>   create mode 100644 tests/tcg/hexagon/v73_scalar.c
Reviewed-by: Anton Johansson <anjo@rev.ng>

