Warning: Permanently added '2620:52:3:1:dead:beef:cafe:c116' (ED25519) to the list of known hosts.

You can reproduce this build on your computer by running:

    sudo dnf install copr-rpmbuild
    /usr/bin/copr-rpmbuild --verbose --drop-resultdir --task-url https://copr.fedorainfracloud.org/backend/get-build-task/9640748-fedora-42-x86_64 --chroot fedora-42-x86_64

Version: 1.6
PID: 2248
Logging PID: 2250
Task:
{'allow_user_ssh': False,
 'appstream': False,
 'background': False,
 'build_id': 9640748,
 'buildroot_pkgs': [],
 'chroot': 'fedora-42-x86_64',
 'enable_net': False,
 'fedora_review': False,
 'git_hash': '38a37bebc7b1ab1ef3d8eb11e3541d2494224ffd',
 'git_repo': 'https://copr-dist-git.fedorainfracloud.org/git/fachep/ollama/ollama-ggml-cuda',
 'isolation': 'default',
 'memory_reqs': 2048,
 'package_name': 'ollama-ggml-cuda',
 'package_version': '0.12.3-1',
 'project_dirname': 'ollama',
 'project_name': 'ollama',
 'project_owner': 'fachep',
 'repo_priority': None,
 'repos': [{'baseurl': 'https://download.copr.fedorainfracloud.org/results/fachep/ollama/fedora-42-x86_64/',
            'id': 'copr_base',
            'name': 'Copr repository',
            'priority': None},
           {'baseurl': 'https://developer.download.nvidia.cn/compute/cuda/repos/fedora42/x86_64/',
            'id': 'https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64',
            'name': 'Additional repo https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64'},
           {'baseurl': 'https://developer.download.nvidia.cn/compute/cuda/repos/fedora41/x86_64/',
            'id': 'https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64',
            'name': 'Additional repo https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64'}],
 'sandbox': 'fachep/ollama--fachep',
 'source_json': {},
 'source_type': None,
 'ssh_public_keys': None,
 'storage': 0,
 'submitter': 'fachep',
 'tags': [],
 'task_id': '9640748-fedora-42-x86_64',
 'timeout': 18000,
 'uses_devel_repo': False,
 'with_opts': [],
 'without_opts': []}

Running: git clone https://copr-dist-git.fedorainfracloud.org/git/fachep/ollama/ollama-ggml-cuda /var/lib/copr-rpmbuild/workspace/workdir-cnrrwoue/ollama-ggml-cuda --depth 500 --no-single-branch --recursive
cmd: ['git', 'clone', 'https://copr-dist-git.fedorainfracloud.org/git/fachep/ollama/ollama-ggml-cuda', '/var/lib/copr-rpmbuild/workspace/workdir-cnrrwoue/ollama-ggml-cuda', '--depth', '500', '--no-single-branch', '--recursive']
cwd: .
rc: 0
stdout:
stderr: Cloning into '/var/lib/copr-rpmbuild/workspace/workdir-cnrrwoue/ollama-ggml-cuda'...

Running: git checkout 38a37bebc7b1ab1ef3d8eb11e3541d2494224ffd --
cmd: ['git', 'checkout', '38a37bebc7b1ab1ef3d8eb11e3541d2494224ffd', '--']
cwd: /var/lib/copr-rpmbuild/workspace/workdir-cnrrwoue/ollama-ggml-cuda
rc: 0
stdout:
stderr: Note: switching to '38a37bebc7b1ab1ef3d8eb11e3541d2494224ffd'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command.
Example: git switch -c <new-branch-name>

Or undo this operation with: git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at 38a37be automatic import of ollama-ggml-cuda

Running: dist-git-client sources
cmd: ['dist-git-client', 'sources']
cwd: /var/lib/copr-rpmbuild/workspace/workdir-cnrrwoue/ollama-ggml-cuda
rc: 0
stdout:
stderr: INFO: Reading stdout from command: git rev-parse --abbrev-ref HEAD
INFO: Reading stdout from command: git rev-parse HEAD
INFO: Reading sources specification file: sources
INFO: Downloading v0.12.3.tar.gz
INFO: Reading stdout from command: curl --help all
INFO: Calling: curl -H Pragma: -o v0.12.3.tar.gz --location --connect-timeout 60 --retry 3 --retry-delay 10 --remote-time --show-error --fail --retry-all-errors https://copr-dist-git.fedorainfracloud.org/repo/pkgs/fachep/ollama/ollama-ggml-cuda/v0.12.3.tar.gz/md5/f096acee5e82596e9afd4d07ed477de2/v0.12.3.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 10.5M  100 10.5M    0     0  24.3M      0 --:--:-- --:--:-- --:--:-- 24.3M
INFO: Reading stdout from command: md5sum v0.12.3.tar.gz
tail: /var/lib/copr-rpmbuild/main.log: file truncated

Running (timeout=18000): unbuffer mock --spec /var/lib/copr-rpmbuild/workspace/workdir-cnrrwoue/ollama-ggml-cuda/ollama-ggml-cuda.spec --sources /var/lib/copr-rpmbuild/workspace/workdir-cnrrwoue/ollama-ggml-cuda --resultdir /var/lib/copr-rpmbuild/results --uniqueext 1759428480.475249 -r /var/lib/copr-rpmbuild/results/configs/child.cfg
INFO: mock.py version 6.3 starting (python version = 3.13.7, NVR = mock-6.3-1.fc42), args: /usr/libexec/mock/mock --spec /var/lib/copr-rpmbuild/workspace/workdir-cnrrwoue/ollama-ggml-cuda/ollama-ggml-cuda.spec --sources /var/lib/copr-rpmbuild/workspace/workdir-cnrrwoue/ollama-ggml-cuda --resultdir /var/lib/copr-rpmbuild/results --uniqueext 1759428480.475249 -r /var/lib/copr-rpmbuild/results/configs/child.cfg
Start(bootstrap): init plugins
INFO: tmpfs initialized
INFO: selinux enabled
INFO: chroot_scan: initialized
INFO: compress_logs: initialized
Finish(bootstrap): init plugins
Start: init plugins
INFO: tmpfs initialized
INFO: selinux enabled
INFO: chroot_scan: initialized
INFO: compress_logs: initialized
Finish: init plugins
INFO: Signal handler active
Start: run
INFO: Start(/var/lib/copr-rpmbuild/workspace/workdir-cnrrwoue/ollama-ggml-cuda/ollama-ggml-cuda.spec) Config(fedora-42-x86_64)
Start: clean chroot
Finish: clean chroot
Mock Version: 6.3
INFO: Mock Version: 6.3
Start(bootstrap): chroot init
INFO: mounting tmpfs at /var/lib/mock/fedora-42-x86_64-bootstrap-1759428480.475249/root.
INFO: calling preinit hooks
INFO: enabled root cache
INFO: enabled package manager cache
Start(bootstrap): cleaning package manager metadata
Finish(bootstrap): cleaning package manager metadata
INFO: Guessed host environment type: unknown
INFO: Using container image: registry.fedoraproject.org/fedora:42
INFO: Pulling image: registry.fedoraproject.org/fedora:42
INFO: Tagging container image as mock-bootstrap-39efaca6-2eaa-4e95-b733-a8ff9aab69a2
INFO: Checking that a957e8565081588d951e7d03e0623a69ff0e5191d672b63a6a58f06e615c432c image matches host's architecture
INFO: Copy content of container a957e8565081588d951e7d03e0623a69ff0e5191d672b63a6a58f06e615c432c to /var/lib/mock/fedora-42-x86_64-bootstrap-1759428480.475249/root
INFO: mounting a957e8565081588d951e7d03e0623a69ff0e5191d672b63a6a58f06e615c432c with podman image mount
INFO: image a957e8565081588d951e7d03e0623a69ff0e5191d672b63a6a58f06e615c432c as /var/lib/containers/storage/overlay/83d7cea453f45be652d7781a7bdec8b1d322b41307e4455bbbfec597db48be36/merged
INFO: umounting image a957e8565081588d951e7d03e0623a69ff0e5191d672b63a6a58f06e615c432c (/var/lib/containers/storage/overlay/83d7cea453f45be652d7781a7bdec8b1d322b41307e4455bbbfec597db48be36/merged) with podman image umount
INFO: Removing image mock-bootstrap-39efaca6-2eaa-4e95-b733-a8ff9aab69a2
INFO: Package manager dnf5 detected and used (fallback)
INFO: Not updating bootstrap chroot, bootstrap_image_ready=True
Start(bootstrap): creating root cache
Finish(bootstrap): creating root cache
Finish(bootstrap): chroot init
Start: chroot init
INFO: mounting tmpfs at /var/lib/mock/fedora-42-x86_64-1759428480.475249/root.
INFO: calling preinit hooks
INFO: enabled root cache
INFO: enabled package manager cache
Start: cleaning package manager metadata
Finish: cleaning package manager metadata
INFO: enabled HW Info plugin
INFO: Package manager dnf5 detected and used (direct choice)
INFO: Buildroot is handled by package management downloaded with a bootstrap image:
  rpm-4.20.1-1.fc42.x86_64
  rpm-sequoia-1.7.0-5.fc42.x86_64
  dnf5-5.2.16.0-1.fc42.x86_64
  dnf5-plugins-5.2.16.0-1.fc42.x86_64
Start: installing minimal buildroot with dnf5
Updating and loading repositories:
 Copr repository                        100% |   3.1 KiB/s |   1.6 KiB | 00m01s
 Additional repo https_developer_downlo 100% |  67.7 KiB/s |  47.8 KiB | 00m01s
 Additional repo https_developer_downlo 100% | 141.7 KiB/s | 109.0 KiB | 00m01s
 fedora                                 100% |  11.4 MiB/s |  35.4 MiB | 00m03s
 updates                                100% | 799.9 KiB/s |  10.3 MiB | 00m13s
Repositories loaded.
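For reference, the dist-git-client step earlier in this log validates the downloaded source against the MD5 embedded in its URL path (f096acee5e82596e9afd4d07ed477de2) by running md5sum on the fetched tarball. A minimal sketch of that check using Python's hashlib; the payload here is a stand-in, since the real v0.12.3.tar.gz is not part of this log:

```python
# Sketch of the checksum verification dist-git-client performs: compare the
# expected MD5 (taken from the download URL above) against the digest of the
# fetched file. The payload below is a stand-in, not the real tarball.
import hashlib


def md5_of(data: bytes) -> str:
    # Hash the contents in one call; for a large tarball you would stream
    # chunks into hashlib.md5().update() instead.
    return hashlib.md5(data).hexdigest()


expected = "f096acee5e82596e9afd4d07ed477de2"
payload = b"stand-in for v0.12.3.tar.gz contents"
print(md5_of(payload) == expected)  # False: the stand-in is not the real tarball
```

In the build, a mismatch here would abort before mock is ever invoked, since the spec's sources would not be trustworthy.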
 Package                            Arch    Version                      Repository  Size
Installing group/module packages:
 bash                               x86_64  5.2.37-1.fc42                fedora      8.2 MiB
 bzip2                              x86_64  1.0.8-20.fc42                fedora      99.3 KiB
 coreutils                          x86_64  9.6-6.fc42                   updates     5.4 MiB
 cpio                               x86_64  2.15-4.fc42                  fedora      1.1 MiB
 diffutils                          x86_64  3.12-1.fc42                  updates     1.6 MiB
 fedora-release-common              noarch  42-30                        updates     20.2 KiB
 findutils                          x86_64  1:4.10.0-5.fc42              fedora      1.9 MiB
 gawk                               x86_64  5.3.1-1.fc42                 fedora      1.7 MiB
 glibc-minimal-langpack             x86_64  2.41-11.fc42                 updates     0.0 B
 grep                               x86_64  3.11-10.fc42                 fedora      1.0 MiB
 gzip                               x86_64  1.13-3.fc42                  fedora      392.9 KiB
 info                               x86_64  7.2-3.fc42                   fedora      357.9 KiB
 patch                              x86_64  2.8-1.fc42                   updates     222.8 KiB
 redhat-rpm-config                  noarch  342-4.fc42                   updates     185.5 KiB
 rpm-build                          x86_64  4.20.1-1.fc42                fedora      168.7 KiB
 sed                                x86_64  4.9-4.fc42                   fedora      857.3 KiB
 shadow-utils                       x86_64  2:4.17.4-1.fc42              fedora      4.0 MiB
 tar                                x86_64  2:1.35-5.fc42                fedora      3.0 MiB
 unzip                              x86_64  6.0-66.fc42                  fedora      390.3 KiB
 util-linux                         x86_64  2.40.4-7.fc42                fedora      3.4 MiB
 which                              x86_64  2.23-2.fc42                  updates     83.5 KiB
 xz                                 x86_64  1:5.8.1-2.fc42               updates     1.3 MiB
Installing dependencies:
 add-determinism                    x86_64  0.6.0-1.fc42                 fedora      2.5 MiB
 alternatives                       x86_64  1.33-1.fc42                  updates     62.2 KiB
 ansible-srpm-macros                noarch  1-17.1.fc42                  fedora      35.7 KiB
 audit-libs                         x86_64  4.1.1-1.fc42                 updates     378.8 KiB
 basesystem                         noarch  11-22.fc42                   fedora      0.0 B
 binutils                           x86_64  2.44-6.fc42                  updates     25.8 MiB
 build-reproducibility-srpm-macros  noarch  0.6.0-1.fc42                 fedora      735.0 B
 bzip2-libs                         x86_64  1.0.8-20.fc42                fedora      84.6 KiB
 ca-certificates                    noarch  2025.2.80_v9.0.304-1.0.fc42  updates     2.7 MiB
 coreutils-common                   x86_64  9.6-6.fc42                   updates     11.1 MiB
 crypto-policies                    noarch  20250707-1.gitad370a8.fc42   updates     142.9 KiB
 curl                               x86_64  8.11.1-6.fc42                updates     450.6 KiB
 cyrus-sasl-lib                     x86_64  2.1.28-30.fc42               fedora      2.3 MiB
 debugedit                          x86_64  5.1-7.fc42                   updates     192.7 KiB
 dwz                                x86_64  0.16-1.fc42                  updates     287.1 KiB
 ed                                 x86_64  1.21-2.fc42                  fedora      146.5 KiB
 efi-srpm-macros                    noarch  6-3.fc42                     updates     40.1 KiB
 elfutils                           x86_64  0.193-2.fc42                 updates     2.9 MiB
 elfutils-debuginfod-client         x86_64  0.193-2.fc42                 updates     83.9 KiB
 elfutils-default-yama-scope        noarch  0.193-2.fc42                 updates     1.8 KiB
 elfutils-libelf                    x86_64  0.193-2.fc42                 updates     1.2 MiB
 elfutils-libs                      x86_64  0.193-2.fc42                 updates     683.4 KiB
 fedora-gpg-keys                    noarch  42-1                         fedora      128.2 KiB
 fedora-release                     noarch  42-30                        updates     0.0 B
 fedora-release-identity-basic      noarch  42-30                        updates     646.0 B
 fedora-repos                       noarch  42-1                         fedora      4.9 KiB
 file                               x86_64  5.46-3.fc42                  updates     100.2 KiB
 file-libs                          x86_64  5.46-3.fc42                  updates     11.9 MiB
 filesystem                         x86_64  3.18-47.fc42                 updates     112.0 B
 filesystem-srpm-macros             noarch  3.18-47.fc42                 updates     38.2 KiB
 fonts-srpm-macros                  noarch  1:2.0.5-22.fc42              updates     55.8 KiB
 forge-srpm-macros                  noarch  0.4.0-2.fc42                 fedora      38.9 KiB
 fpc-srpm-macros                    noarch  1.3-14.fc42                  fedora      144.0 B
 gdb-minimal                        x86_64  16.3-1.fc42                  updates     13.2 MiB
 gdbm-libs                          x86_64  1:1.23-9.fc42                fedora      129.9 KiB
 ghc-srpm-macros                    noarch  1.9.2-2.fc42                 fedora      779.0 B
 glibc                              x86_64  2.41-11.fc42                 updates     6.6 MiB
 glibc-common                       x86_64  2.41-11.fc42                 updates     1.0 MiB
 glibc-gconv-extra                  x86_64  2.41-11.fc42                 updates     7.2 MiB
 gmp                                x86_64  1:6.3.0-4.fc42               fedora      811.3 KiB
 gnat-srpm-macros                   noarch  6-7.fc42                     fedora      1.0 KiB
 gnulib-l10n                        noarch  20241231-1.fc42              updates     655.0 KiB
 go-srpm-macros                     noarch  3.8.0-1.fc42                 updates     61.9 KiB
 jansson                            x86_64  2.14-2.fc42                  fedora      93.1 KiB
 json-c                             x86_64  0.18-2.fc42                  fedora      86.7 KiB
 kernel-srpm-macros                 noarch  1.0-25.fc42                  fedora      1.9 KiB
 keyutils-libs                      x86_64  1.6.3-5.fc42                 fedora      58.3 KiB
 krb5-libs                          x86_64  1.21.3-6.fc42                updates     2.3 MiB
 libacl                             x86_64  2.3.2-3.fc42                 fedora      38.3 KiB
 libarchive                         x86_64  3.8.1-1.fc42                 updates     955.2 KiB
 libattr                            x86_64  2.5.2-5.fc42                 fedora      27.1 KiB
 libblkid                           x86_64  2.40.4-7.fc42                fedora      262.4 KiB
 libbrotli                          x86_64  1.1.0-6.fc42                 fedora      841.3 KiB
 libcap                             x86_64  2.73-2.fc42                  fedora      207.1 KiB
 libcap-ng                          x86_64  0.8.5-4.fc42                 fedora      72.9 KiB
 libcom_err                         x86_64  1.47.2-3.fc42                fedora      67.1 KiB
 libcurl                            x86_64  8.11.1-6.fc42                updates     834.1 KiB
 libeconf                           x86_64  0.7.6-2.fc42                 updates     64.6 KiB
 libevent                           x86_64  2.1.12-15.fc42               fedora      903.1 KiB
 libfdisk                           x86_64  2.40.4-7.fc42                fedora      372.3 KiB
 libffi                             x86_64  3.4.6-5.fc42                 fedora      82.3 KiB
 libgcc                             x86_64  15.2.1-1.fc42                updates     266.6 KiB
 libgomp                            x86_64  15.2.1-1.fc42                updates     541.1 KiB
 libidn2                            x86_64  2.3.8-1.fc42                 fedora      556.5 KiB
 libmount                           x86_64  2.40.4-7.fc42                fedora      356.3 KiB
 libnghttp2                         x86_64  1.64.0-3.fc42                fedora      170.4 KiB
 libpkgconf                         x86_64  2.3.0-2.fc42                 fedora      78.1 KiB
 libpsl                             x86_64  0.21.5-5.fc42                fedora      76.4 KiB
 libselinux                         x86_64  3.8-3.fc42                   updates     193.1 KiB
 libsemanage                        x86_64  3.8.1-2.fc42                 updates     304.4 KiB
 libsepol                           x86_64  3.8-1.fc42                   fedora      826.0 KiB
 libsmartcols                       x86_64  2.40.4-7.fc42                fedora      180.4 KiB
 libssh                             x86_64  0.11.3-1.fc42                updates     567.1 KiB
 libssh-config                      noarch  0.11.3-1.fc42                updates     277.0 B
 libstdc++                          x86_64  15.2.1-1.fc42                updates     2.8 MiB
 libtasn1                           x86_64  4.20.0-1.fc42                fedora      176.3 KiB
 libtool-ltdl                       x86_64  2.5.4-4.fc42                 fedora      70.1 KiB
 libunistring                       x86_64  1.1-9.fc42                   fedora      1.7 MiB
 libuuid                            x86_64  2.40.4-7.fc42                fedora      37.3 KiB
 libverto                           x86_64  0.3.2-10.fc42                fedora      25.4 KiB
 libxcrypt                          x86_64  4.4.38-7.fc42                updates     284.5 KiB
 libxml2                            x86_64  2.12.10-1.fc42               fedora      1.7 MiB
 libzstd                            x86_64  1.5.7-1.fc42                 fedora      807.8 KiB
 lua-libs                           x86_64  5.4.8-1.fc42                 updates     280.8 KiB
 lua-srpm-macros                    noarch  1-15.fc42                    fedora      1.3 KiB
 lz4-libs                           x86_64  1.10.0-2.fc42                fedora      157.4 KiB
 mpfr                               x86_64  4.2.2-1.fc42                 fedora      828.8 KiB
 ncurses-base                       noarch  6.5-5.20250125.fc42          fedora      326.8 KiB
 ncurses-libs                       x86_64  6.5-5.20250125.fc42          fedora      946.3 KiB
 ocaml-srpm-macros                  noarch  10-4.fc42                    fedora      1.9 KiB
 openblas-srpm-macros               noarch  2-19.fc42                    fedora      112.0 B
 openldap                           x86_64  2.6.10-1.fc42                updates     655.8 KiB
 openssl-libs                       x86_64  1:3.2.4-4.fc42               updates     7.8 MiB
 p11-kit                            x86_64  0.25.8-1.fc42                updates     2.3 MiB
 p11-kit-trust                      x86_64  0.25.8-1.fc42                updates     446.5 KiB
 package-notes-srpm-macros          noarch  0.5-13.fc42                  fedora      1.6 KiB
 pam-libs                           x86_64  1.7.0-6.fc42                 updates     126.7 KiB
 pcre2                              x86_64  10.45-1.fc42                 fedora      697.7 KiB
 pcre2-syntax                       noarch  10.45-1.fc42                 fedora      273.9 KiB
 perl-srpm-macros                   noarch  1-57.fc42                    fedora      861.0 B
 pkgconf                            x86_64  2.3.0-2.fc42                 fedora      88.5 KiB
 pkgconf-m4                         noarch  2.3.0-2.fc42                 fedora      14.4 KiB
 pkgconf-pkg-config                 x86_64  2.3.0-2.fc42                 fedora      989.0 B
 popt                               x86_64  1.19-8.fc42                  fedora      132.8 KiB
 publicsuffix-list-dafsa            noarch  20250616-1.fc42              updates     69.1 KiB
 pyproject-srpm-macros              noarch  1.18.4-1.fc42                updates     1.9 KiB
 python-srpm-macros                 noarch  3.13-5.fc42                  updates     51.0 KiB
 qt5-srpm-macros                    noarch  5.15.17-1.fc42               updates     500.0 B
 qt6-srpm-macros                    noarch  6.9.2-1.fc42                 updates     464.0 B
 readline                           x86_64  8.2-13.fc42                  fedora      485.0 KiB
 rpm                                x86_64  4.20.1-1.fc42                fedora      3.1 MiB
 rpm-build-libs                     x86_64  4.20.1-1.fc42                fedora      206.6 KiB
 rpm-libs                           x86_64  4.20.1-1.fc42                fedora      721.8 KiB
 rpm-sequoia                        x86_64  1.7.0-5.fc42                 fedora      2.4 MiB
 rust-srpm-macros                   noarch  26.4-1.fc42                  updates     4.8 KiB
 setup                              noarch  2.15.0-13.fc42               fedora      720.9 KiB
 sqlite-libs                        x86_64  3.47.2-5.fc42                updates     1.5 MiB
 systemd-libs                       x86_64  257.9-2.fc42                 updates     2.2 MiB
 systemd-standalone-sysusers        x86_64  257.9-2.fc42                 updates     277.3 KiB
 tree-sitter-srpm-macros            noarch  0.1.0-8.fc42                 fedora      6.5 KiB
 util-linux-core                    x86_64  2.40.4-7.fc42                fedora      1.4 MiB
 xxhash-libs                        x86_64  0.8.3-2.fc42                 fedora      90.2 KiB
 xz-libs                            x86_64  1:5.8.1-2.fc42               updates     217.8 KiB
 zig-srpm-macros                    noarch  1-4.fc42                     fedora      1.1 KiB
 zip                                x86_64  3.0-43.fc42                  fedora      698.5 KiB
 zlib-ng-compat                     x86_64  2.2.5-2.fc42                 updates     137.6 KiB
 zstd                               x86_64  1.5.7-1.fc42                 fedora      1.7 MiB
Installing groups:
 Buildsystem building group

Transaction Summary:
 Installing:       149 packages

Total size of inbound packages is 52 MiB. Need to download 52 MiB.
After this operation, 178 MiB extra will be used (install 178 MiB, remove 0 B).
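The summary's installed-size figure is just the Size column summed with unit conversion (1 KiB = 1024 B, 1 MiB = 1024 KiB). A small illustrative sketch of that arithmetic using three rows from the table above; this is not dnf's actual code, only the unit math:

```python
# Sketch: sum per-package sizes like those in the Size column above.
# Rows are (package, size-string) pairs copied from the transaction table.
rows = [
    ("bash", "8.2 MiB"),
    ("bzip2", "99.3 KiB"),
    ("coreutils", "5.4 MiB"),
]

UNITS = {"B": 1, "KiB": 1024, "MiB": 1024 ** 2}


def to_bytes(size: str) -> float:
    # "8.2 MiB" -> 8.2 * 1024**2 bytes
    value, unit = size.split()
    return float(value) * UNITS[unit]


total = sum(to_bytes(size) for _, size in rows)
print(f"{total / 1024 ** 2:.1f} MiB")  # 13.7 MiB for these three rows
```

Summing all 149 rows this way gives the "178 MiB extra" install figure; the separate 52 MiB number is the compressed download size of the RPMs, which the table does not show.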
[ 1/149] bzip2-0:1.0.8-20.fc42.x86_64 100% | 148.8 KiB/s | 52.1 KiB | 00m00s
[ 2/149] cpio-0:2.15-4.fc42.x86_64 100% | 553.8 KiB/s | 294.6 KiB | 00m01s
[ 3/149] grep-0:3.11-10.fc42.x86_64 100% | 1.2 MiB/s | 300.1 KiB | 00m00s
[ 4/149] findutils-1:4.10.0-5.fc42.x86 100% | 1.0 MiB/s | 551.5 KiB | 00m01s
[ 5/149] gzip-0:1.13-3.fc42.x86_64 100% | 718.8 KiB/s | 170.4 KiB | 00m00s
[ 6/149] info-0:7.2-3.fc42.x86_64 100% | 1.0 MiB/s | 183.8 KiB | 00m00s
[ 7/149] bash-0:5.2.37-1.fc42.x86_64 100% | 1.7 MiB/s | 1.8 MiB | 00m01s
[ 8/149] rpm-build-0:4.20.1-1.fc42.x86 100% | 779.2 KiB/s | 81.8 KiB | 00m00s
[ 9/149] sed-0:4.9-4.fc42.x86_64 100% | 1.4 MiB/s | 317.3 KiB | 00m00s
[ 10/149] shadow-utils-2:4.17.4-1.fc42. 100% | 4.4 MiB/s | 1.3 MiB | 00m00s
[ 11/149] unzip-0:6.0-66.fc42.x86_64 100% | 1.1 MiB/s | 184.6 KiB | 00m00s
[ 12/149] tar-2:1.35-5.fc42.x86_64 100% | 2.3 MiB/s | 862.5 KiB | 00m00s
[ 13/149] fedora-release-common-0:42-30 100% | 60.6 KiB/s | 24.5 KiB | 00m00s
[ 14/149] gawk-0:5.3.1-1.fc42.x86_64 100% | 2.7 MiB/s | 1.1 MiB | 00m00s
[ 15/149] diffutils-0:3.12-1.fc42.x86_6 100% | 310.1 KiB/s | 392.6 KiB | 00m01s
[ 16/149] patch-0:2.8-1.fc42.x86_64 100% | 279.5 KiB/s | 113.5 KiB | 00m00s
[ 17/149] glibc-minimal-langpack-0:2.41 100% | 106.3 KiB/s | 98.7 KiB | 00m01s
[ 18/149] redhat-rpm-config-0:342-4.fc4 100% | 340.7 KiB/s | 81.1 KiB | 00m00s
[ 19/149] util-linux-0:2.40.4-7.fc42.x8 100% | 4.9 MiB/s | 1.2 MiB | 00m00s
[ 20/149] which-0:2.23-2.fc42.x86_64 100% | 257.7 KiB/s | 41.7 KiB | 00m00s
[ 21/149] ncurses-libs-0:6.5-5.20250125 100% | 2.8 MiB/s | 335.0 KiB | 00m00s
[ 22/149] bzip2-libs-0:1.0.8-20.fc42.x8 100% | 613.7 KiB/s | 43.6 KiB | 00m00s
[ 23/149] pcre2-0:10.45-1.fc42.x86_64 100% | 2.5 MiB/s | 262.8 KiB | 00m00s
[ 24/149] popt-0:1.19-8.fc42.x86_64 100% | 867.7 KiB/s | 65.9 KiB | 00m00s
[ 25/149] coreutils-0:9.6-6.fc42.x86_64 100% | 457.5 KiB/s | 1.1 MiB | 00m03s
[ 26/149] readline-0:8.2-13.fc42.x86_64 100% | 2.2 MiB/s | 215.2 KiB | 00m00s
[ 27/149] rpm-build-libs-0:4.20.1-1.fc4 100% | 1.2 MiB/s | 99.7 KiB | 00m00s
[ 28/149] rpm-libs-0:4.20.1-1.fc42.x86_ 100% | 2.8 MiB/s | 312.0 KiB | 00m00s
[ 29/149] zstd-0:1.5.7-1.fc42.x86_64 100% | 3.5 MiB/s | 485.9 KiB | 00m00s
[ 30/149] libacl-0:2.3.2-3.fc42.x86_64 100% | 333.4 KiB/s | 23.0 KiB | 00m00s
[ 31/149] setup-0:2.15.0-13.fc42.noarch 100% | 1.7 MiB/s | 155.8 KiB | 00m00s
[ 32/149] rpm-0:4.20.1-1.fc42.x86_64 100% | 928.0 KiB/s | 548.4 KiB | 00m01s
[ 33/149] gmp-1:6.3.0-4.fc42.x86_64 100% | 2.6 MiB/s | 317.7 KiB | 00m00s
[ 34/149] libattr-0:2.5.2-5.fc42.x86_64 100% | 227.8 KiB/s | 17.1 KiB | 00m00s
[ 35/149] libcap-0:2.73-2.fc42.x86_64 100% | 969.0 KiB/s | 84.3 KiB | 00m00s
[ 36/149] fedora-repos-0:42-1.noarch 100% | 137.7 KiB/s | 9.2 KiB | 00m00s
[ 37/149] mpfr-0:4.2.2-1.fc42.x86_64 100% | 2.9 MiB/s | 345.3 KiB | 00m00s
[ 38/149] xz-1:5.8.1-2.fc42.x86_64 100% | 333.9 KiB/s | 572.6 KiB | 00m02s
[ 39/149] ed-0:1.21-2.fc42.x86_64 100% | 1.0 MiB/s | 82.0 KiB | 00m00s
[ 40/149] ansible-srpm-macros-0:1-17.1. 100% | 298.7 KiB/s | 20.3 KiB | 00m00s
[ 41/149] build-reproducibility-srpm-ma 100% | 174.4 KiB/s | 11.7 KiB | 00m00s
[ 42/149] forge-srpm-macros-0:0.4.0-2.f 100% | 291.9 KiB/s | 19.9 KiB | 00m00s
[ 43/149] fpc-srpm-macros-0:1.3-14.fc42 100% | 121.5 KiB/s | 8.0 KiB | 00m00s
[ 44/149] ghc-srpm-macros-0:1.9.2-2.fc4 100% | 136.7 KiB/s | 9.2 KiB | 00m00s
[ 45/149] gnat-srpm-macros-0:6-7.fc42.n 100% | 128.5 KiB/s | 8.6 KiB | 00m00s
[ 46/149] kernel-srpm-macros-0:1.0-25.f 100% | 143.1 KiB/s | 9.9 KiB | 00m00s
[ 47/149] lua-srpm-macros-0:1-15.fc42.n 100% | 125.6 KiB/s | 8.9 KiB | 00m00s
[ 48/149] glibc-common-0:2.41-11.fc42.x 100% | 397.5 KiB/s | 385.6 KiB | 00m01s
[ 49/149] ocaml-srpm-macros-0:10-4.fc42 100% | 124.4 KiB/s | 9.2 KiB | 00m00s
[ 50/149] openblas-srpm-macros-0:2-19.f 100% | 110.9 KiB/s | 7.8 KiB | 00m00s
[ 51/149] package-notes-srpm-macros-0:0 100% | 138.2 KiB/s | 9.3 KiB | 00m00s
[ 52/149] perl-srpm-macros-0:1-57.fc42. 100% | 119.8 KiB/s | 8.5 KiB | 00m00s
[ 53/149] tree-sitter-srpm-macros-0:0.1 100% | 167.6 KiB/s | 11.2 KiB | 00m00s
[ 54/149] zig-srpm-macros-0:1-4.fc42.no 100% | 124.9 KiB/s | 8.2 KiB | 00m00s
[ 55/149] libblkid-0:2.40.4-7.fc42.x86_ 100% | 1.4 MiB/s | 122.5 KiB | 00m00s
[ 56/149] zip-0:3.0-43.fc42.x86_64 100% | 2.5 MiB/s | 263.5 KiB | 00m00s
[ 57/149] libcap-ng-0:0.8.5-4.fc42.x86_ 100% | 466.2 KiB/s | 32.2 KiB | 00m00s
[ 58/149] libfdisk-0:2.40.4-7.fc42.x86_ 100% | 1.8 MiB/s | 158.5 KiB | 00m00s
[ 59/149] libmount-0:2.40.4-7.fc42.x86_ 100% | 1.8 MiB/s | 155.1 KiB | 00m00s
[ 60/149] libsmartcols-0:2.40.4-7.fc42. 100% | 1.0 MiB/s | 81.2 KiB | 00m00s
[ 61/149] libuuid-0:2.40.4-7.fc42.x86_6 100% | 367.2 KiB/s | 25.3 KiB | 00m00s
[ 62/149] util-linux-core-0:2.40.4-7.fc 100% | 3.7 MiB/s | 529.2 KiB | 00m00s
[ 63/149] ncurses-base-0:6.5-5.20250125 100% | 1.1 MiB/s | 88.1 KiB | 00m00s
[ 64/149] pcre2-syntax-0:10.45-1.fc42.n 100% | 1.8 MiB/s | 161.7 KiB | 00m00s
[ 65/149] xz-libs-1:5.8.1-2.fc42.x86_64 100% | 357.6 KiB/s | 113.0 KiB | 00m00s
[ 66/149] libzstd-0:1.5.7-1.fc42.x86_64 100% | 2.8 MiB/s | 314.8 KiB | 00m00s
[ 67/149] lz4-libs-0:1.10.0-2.fc42.x86_ 100% | 1.0 MiB/s | 78.1 KiB | 00m00s
[ 68/149] rpm-sequoia-0:1.7.0-5.fc42.x8 100% | 4.2 MiB/s | 911.1 KiB | 00m00s
[ 69/149] fedora-gpg-keys-0:42-1.noarch 100% | 1.6 MiB/s | 135.6 KiB | 00m00s
[ 70/149] gnulib-l10n-0:20241231-1.fc42 100% | 475.0 KiB/s | 150.1 KiB | 00m00s
[ 71/149] add-determinism-0:0.6.0-1.fc4 100% | 4.9 MiB/s | 918.3 KiB | 00m00s
[ 72/149] coreutils-common-0:9.6-6.fc42 100% | 783.9 KiB/s | 2.1 MiB | 00m03s
[ 73/149] basesystem-0:11-22.fc42.noarc 100% | 102.7 KiB/s | 7.3 KiB | 00m00s
[ 74/149] dwz-0:0.16-1.fc42.x86_64 100% | 489.3 KiB/s | 135.5 KiB | 00m00s
[ 75/149] efi-srpm-macros-0:6-3.fc42.no 100% | 284.7 KiB/s | 22.5 KiB | 00m00s
[ 76/149] file-0:5.46-3.fc42.x86_64 100% | 298.5 KiB/s | 48.6 KiB | 00m00s
[ 77/149] file-libs-0:5.46-3.fc42.x86_6 100% | 816.8 KiB/s | 849.5 KiB | 00m01s
[ 78/149] filesystem-srpm-macros-0:3.18 100% | 318.0 KiB/s | 26.1 KiB | 00m00s
[ 79/149] fonts-srpm-macros-1:2.0.5-22. 100% | 299.0 KiB/s | 27.2 KiB | 00m00s
[ 80/149] go-srpm-macros-0:3.8.0-1.fc42 100% | 341.0 KiB/s | 28.3 KiB | 00m00s
[ 81/149] pyproject-srpm-macros-0:1.18. 100% | 165.4 KiB/s | 13.7 KiB | 00m00s
[ 82/149] glibc-gconv-extra-0:2.41-11.f 100% | 763.1 KiB/s | 1.6 MiB | 00m02s
[ 83/149] python-srpm-macros-0:3.13-5.f 100% | 249.6 KiB/s | 22.5 KiB | 00m00s
[ 84/149] qt5-srpm-macros-0:5.15.17-1.f 100% | 95.8 KiB/s | 8.7 KiB | 00m00s
[ 85/149] qt6-srpm-macros-0:6.9.2-1.fc4 100% | 104.2 KiB/s | 9.4 KiB | 00m00s
[ 86/149] rust-srpm-macros-0:26.4-1.fc4 100% | 106.6 KiB/s | 11.2 KiB | 00m00s
[ 87/149] libgcc-0:15.2.1-1.fc42.x86_64 100% | 491.0 KiB/s | 131.6 KiB | 00m00s
[ 88/149] zlib-ng-compat-0:2.2.5-2.fc42 100% | 167.8 KiB/s | 79.2 KiB | 00m00s
[ 89/149] glibc-0:2.41-11.fc42.x86_64 100% | 699.3 KiB/s | 2.2 MiB | 00m03s
[ 90/149] elfutils-libelf-0:0.193-2.fc4 100% | 637.4 KiB/s | 207.8 KiB | 00m00s
[ 91/149] elfutils-libs-0:0.193-2.fc42. 100% | 678.9 KiB/s | 270.2 KiB | 00m00s
[ 92/149] elfutils-debuginfod-client-0: 100% | 295.2 KiB/s | 46.9 KiB | 00m00s
[ 93/149] filesystem-0:3.18-47.fc42.x86 100% | 805.0 KiB/s | 1.3 MiB | 00m02s
[ 94/149] json-c-0:0.18-2.fc42.x86_64 100% | 168.2 KiB/s | 44.9 KiB | 00m00s
[ 95/149] libselinux-0:3.8-3.fc42.x86_6 100% | 608.0 KiB/s | 96.7 KiB | 00m00s
[ 96/149] elfutils-0:0.193-2.fc42.x86_6 100% | 712.5 KiB/s | 571.4 KiB | 00m01s
[ 97/149] libsepol-0:3.8-1.fc42.x86_64 100% | 1.2 MiB/s | 348.9 KiB | 00m00s
[ 98/149] libxcrypt-0:4.4.38-7.fc42.x86 100% | 530.1 KiB/s | 127.2 KiB | 00m00s
[ 99/149] audit-libs-0:4.1.1-1.fc42.x86 100% | 574.7 KiB/s | 138.5 KiB | 00m00s
[100/149] pam-libs-0:1.7.0-6.fc42.x86_6 100% | 340.4 KiB/s | 57.5 KiB | 00m00s
[101/149] libeconf-0:0.7.6-2.fc42.x86_6 100% | 439.6 KiB/s | 35.2 KiB | 00m00s
[102/149] systemd-libs-0:257.9-2.fc42.x 100% | 846.8 KiB/s | 810.3 KiB | 00m01s
[103/149] libsemanage-0:3.8.1-2.fc42.x8 100% | 765.5 KiB/s | 123.2 KiB | 00m00s
[104/149] libstdc++-0:15.2.1-1.fc42.x86 100% | 826.9 KiB/s | 917.8 KiB | 00m01s
[105/149] lua-libs-0:5.4.8-1.fc42.x86_6 100% | 547.4 KiB/s | 131.9 KiB | 00m00s
[106/149] libgomp-0:15.2.1-1.fc42.x86_6 100% | 774.2 KiB/s | 371.6 KiB | 00m00s
[107/149] sqlite-libs-0:3.47.2-5.fc42.x 100% | 785.2 KiB/s | 753.8 KiB | 00m01s
[108/149] jansson-0:2.14-2.fc42.x86_64 100% | 168.1 KiB/s | 45.7 KiB | 00m00s
[109/149] debugedit-0:5.1-7.fc42.x86_64 100% | 492.4 KiB/s | 78.8 KiB | 00m00s
[110/149] libarchive-0:3.8.1-1.fc42.x86 100% | 518.5 KiB/s | 421.6 KiB | 00m01s
[111/149] libxml2-0:2.12.10-1.fc42.x86_ 100% | 1.2 MiB/s | 683.7 KiB | 00m01s
[112/149] pkgconf-pkg-config-0:2.3.0-2. 100% | 141.8 KiB/s | 9.9 KiB | 00m00s
[113/149] pkgconf-0:2.3.0-2.fc42.x86_64 100% | 582.9 KiB/s | 44.9 KiB | 00m00s
[114/149] pkgconf-m4-0:2.3.0-2.fc42.noa 100% | 189.8 KiB/s | 14.2 KiB | 00m00s
[115/149] libpkgconf-0:2.3.0-2.fc42.x86 100% | 485.7 KiB/s | 38.4 KiB | 00m00s
[116/149] openssl-libs-1:3.2.4-4.fc42.x 100% | 655.7 KiB/s | 2.3 MiB | 00m04s
[117/149] curl-0:8.11.1-6.fc42.x86_64 100% | 462.2 KiB/s | 220.0 KiB | 00m00s
[118/149] crypto-policies-0:20250707-1. 100% | 592.4 KiB/s | 96.0 KiB | 00m00s
[119/149] elfutils-default-yama-scope-0 100% | 157.3 KiB/s | 12.6 KiB | 00m00s
[120/149] libffi-0:3.4.6-5.fc42.x86_64 100% | 578.7 KiB/s | 39.9 KiB | 00m00s
[121/149] p11-kit-0:0.25.8-1.fc42.x86_6 100% | 273.4 KiB/s | 503.5 KiB | 00m02s
[122/149] libtasn1-0:4.20.0-1.fc42.x86_ 100% | 1.0 MiB/s | 75.0 KiB | 00m00s
[123/149] p11-kit-trust-0:0.25.8-1.fc42 100% | 282.9 KiB/s | 139.2 KiB | 00m00s
[124/149] alternatives-0:1.33-1.fc42.x8 100% | 248.7 KiB/s | 40.5 KiB | 00m00s
[125/149] fedora-release-0:42-30.noarch 100% | 169.0 KiB/s | 13.5 KiB | 00m00s
[126/149] ca-certificates-0:2025.2.80_v 100% | 312.6 KiB/s | 973.5 KiB | 00m03s
[127/149] fedora-release-identity-basic 100% | 172.2 KiB/s | 14.3 KiB | 00m00s
[128/149] libbrotli-0:1.1.0-6.fc42.x86_ 100% | 3.2 MiB/s | 339.8 KiB | 00m00s
[129/149] libidn2-0:2.3.8-1.fc42.x86_64 100% | 2.1 MiB/s | 174.8 KiB | 00m00s
[130/149] libnghttp2-0:1.64.0-3.fc42.x8 100% | 1.0 MiB/s | 77.7 KiB | 00m00s
[131/149] libpsl-0:0.21.5-5.fc42.x86_64 100% | 902.0 KiB/s | 64.0 KiB | 00m00s
[132/149] libcurl-0:8.11.1-6.fc42.x86_6 100% | 356.7 KiB/s | 371.7 KiB | 00m01s
[133/149] libssh-0:0.11.3-1.fc42.x86_64 100% | 316.6 KiB/s | 233.0 KiB | 00m01s
[134/149] libunistring-0:1.1-9.fc42.x86 100% | 4.3 MiB/s | 542.5 KiB | 00m00s
[135/149] libssh-config-0:0.11.3-1.fc42 100% | 103.5 KiB/s | 9.1 KiB | 00m00s
[136/149] xxhash-libs-0:0.8.3-2.fc42.x8 100% | 574.5 KiB/s | 39.1 KiB | 00m00s
[137/149] systemd-standalone-sysusers-0 100% | 486.9 KiB/s | 154.8 KiB | 00m00s
[138/149] publicsuffix-list-dafsa-0:202 100% | 344.0 KiB/s | 59.2 KiB | 00m00s
[139/149] krb5-libs-0:1.21.3-6.fc42.x86 100% | 789.8 KiB/s | 759.8 KiB | 00m01s
[140/149] keyutils-libs-0:1.6.3-5.fc42. 100% | 470.7 KiB/s | 31.5 KiB | 00m00s
[141/149] libcom_err-0:1.47.2-3.fc42.x8 100% | 401.9 KiB/s | 26.9 KiB | 00m00s
[142/149] libverto-0:0.3.2-10.fc42.x86_ 100% | 310.5 KiB/s | 20.8 KiB | 00m00s
[143/149] openldap-0:2.6.10-1.fc42.x86_ 100% | 289.9 KiB/s | 258.6 KiB | 00m01s
[144/149] binutils-0:2.44-6.fc42.x86_64 100% | 600.1 KiB/s | 5.8 MiB | 00m10s
[145/149] cyrus-sasl-lib-0:2.1.28-30.fc 100% | 5.9 MiB/s | 793.5 KiB | 00m00s
[146/149] libtool-ltdl-0:2.5.4-4.fc42.x 100% | 539.9 KiB/s | 36.2 KiB | 00m00s
[147/149] gdbm-libs-1:1.23-9.fc42.x86_6 100% | 838.6 KiB/s | 57.0 KiB | 00m00s
[148/149] libevent-0:2.1.12-15.fc42.x86 100% | 579.5 KiB/s | 260.2 KiB | 00m00s
[149/149] gdb-minimal-0:16.3-1.fc42.x86 100% | 832.7 KiB/s | 4.4 MiB | 00m05s
--------------------------------------------------------------------------------
[149/149] Total 100% | 2.0 MiB/s | 52.4 MiB | 00m26s
Running transaction
Importing OpenPGP key 0x105EF944:
 UserID : "Fedora (42) <fedora-42-primary@fedoraproject.org>"
 Fingerprint: B0F4950458F69E1150C6C5EDC8AC4916105EF944
 From : file:///usr/share/distribution-gpg-keys/fedora/RPM-GPG-KEY-fedora-42-primary
The key was successfully imported.
[ 1/151] Verify package files 100% | 726.0 B/s | 149.0 B | 00m00s
[ 2/151] Prepare transaction 100% | 1.9 KiB/s | 149.0 B | 00m00s
[ 3/151] Installing libgcc-0:15.2.1-1. 100% | 130.9 MiB/s | 268.2 KiB | 00m00s
[ 4/151] Installing publicsuffix-list- 100% | 68.2 MiB/s | 69.8 KiB | 00m00s
[ 5/151] Installing libssh-config-0:0. 100% | 0.0 B/s | 816.0 B | 00m00s
[ 6/151] Installing fedora-release-ide 100% | 882.8 KiB/s | 904.0 B | 00m00s
[ 7/151] Installing fedora-gpg-keys-0: 100% | 19.0 MiB/s | 174.8 KiB | 00m00s
[ 8/151] Installing fedora-repos-0:42- 100% | 0.0 B/s | 5.7 KiB | 00m00s
[ 9/151] Installing fedora-release-com 100% | 12.0 MiB/s | 24.5 KiB | 00m00s
[ 10/151] Installing fedora-release-0:4 100% | 3.6 KiB/s | 124.0 B | 00m00s
>>> Running sysusers scriptlet: setup-0:2.15.0-13.fc42.noarch
>>> Finished sysusers scriptlet: setup-0:2.15.0-13.fc42.noarch
>>> Scriptlet output:
>>> Creating group 'adm' with GID 4.
>>> Creating group 'audio' with GID 63.
>>> Creating group 'bin' with GID 1.
>>> Creating group 'cdrom' with GID 11.
>>> Creating group 'clock' with GID 103.
>>> Creating group 'daemon' with GID 2.
>>> Creating group 'dialout' with GID 18.
>>> Creating group 'disk' with GID 6.
>>> Creating group 'floppy' with GID 19.
>>> Creating group 'ftp' with GID 50.
>>> Creating group 'games' with GID 20.
>>> Creating group 'input' with GID 104.
>>> Creating group 'kmem' with GID 9.
>>> Creating group 'kvm' with GID 36.
>>> Creating group 'lock' with GID 54.
>>> Creating group 'lp' with GID 7.
>>> Creating group 'mail' with GID 12.
>>> Creating group 'man' with GID 15.
>>> Creating group 'mem' with GID 8.
>>> Creating group 'nobody' with GID 65534.
>>> Creating group 'render' with GID 105.
>>> Creating group 'root' with GID 0.
>>> Creating group 'sgx' with GID 106.
>>> Creating group 'sys' with GID 3.
>>> Creating group 'tape' with GID 33.
>>> Creating group 'tty' with GID 5.
>>> Creating group 'users' with GID 100.
>>> Creating group 'utmp' with GID 22.
>>> Creating group 'video' with GID 39.
>>> Creating group 'wheel' with GID 10.
>>>
>>> Running sysusers scriptlet: setup-0:2.15.0-13.fc42.noarch
>>> Finished sysusers scriptlet: setup-0:2.15.0-13.fc42.noarch
>>> Scriptlet output:
>>> Creating user 'adm' (adm) with UID 3 and GID 4.
>>> Creating user 'bin' (bin) with UID 1 and GID 1.
>>> Creating user 'daemon' (daemon) with UID 2 and GID 2.
>>> Creating user 'ftp' (FTP User) with UID 14 and GID 50.
>>> Creating user 'games' (games) with UID 12 and GID 20.
>>> Creating user 'halt' (halt) with UID 7 and GID 0.
>>> Creating user 'lp' (lp) with UID 4 and GID 7.
>>> Creating user 'mail' (mail) with UID 8 and GID 12.
>>> Creating user 'nobody' (Kernel Overflow User) with UID 65534 and GID 65534.
>>> Creating user 'operator' (operator) with UID 11 and GID 0.
>>> Creating user 'root' (Super User) with UID 0 and GID 0.
>>> Creating user 'shutdown' (shutdown) with UID 6 and GID 0.
>>> Creating user 'sync' (sync) with UID 5 and GID 0.
>>>
[ 11/151] Installing setup-0:2.15.0-13. 100% | 39.4 MiB/s | 726.7 KiB | 00m00s
>>> [RPM] /etc/hosts created as /etc/hosts.rpmnew
[ 12/151] Installing filesystem-0:3.18- 100% | 1.4 MiB/s | 212.8 KiB | 00m00s
[ 13/151] Installing basesystem-0:11-22 100% | 0.0 B/s | 124.0 B | 00m00s
[ 14/151] Installing pkgconf-m4-0:2.3.0 100% | 0.0 B/s | 14.8 KiB | 00m00s
[ 15/151] Installing rust-srpm-macros-0 100% | 0.0 B/s | 5.6 KiB | 00m00s
[ 16/151] Installing qt6-srpm-macros-0: 100% | 0.0 B/s | 740.0 B | 00m00s
[ 17/151] Installing qt5-srpm-macros-0: 100% | 0.0 B/s | 776.0 B | 00m00s
[ 18/151] Installing gnulib-l10n-0:2024 100% | 107.7 MiB/s | 661.9 KiB | 00m00s
[ 19/151] Installing coreutils-common-0 100% | 242.5 MiB/s | 11.2 MiB | 00m00s
[ 20/151] Installing pcre2-syntax-0:10. 100% | 135.0 MiB/s | 276.4 KiB | 00m00s
[ 21/151] Installing ncurses-base-0:6.5 100% | 38.2 MiB/s | 352.2 KiB | 00m00s
[ 22/151] Installing glibc-minimal-lang 100% | 0.0 B/s | 124.0 B | 00m00s
[ 23/151] Installing ncurses-libs-0:6.5 100% | 155.1 MiB/s | 952.8 KiB | 00m00s
[ 24/151] Installing glibc-0:2.41-11.fc 100% | 154.7 MiB/s | 6.7 MiB | 00m00s
[ 25/151] Installing bash-0:5.2.37-1.fc 100% | 199.3 MiB/s | 8.2 MiB | 00m00s
[ 26/151] Installing glibc-common-0:2.4 100% | 51.0 MiB/s | 1.0 MiB | 00m00s
[ 27/151] Installing glibc-gconv-extra- 100% | 143.3 MiB/s | 7.3 MiB | 00m00s
[ 28/151] Installing zlib-ng-compat-0:2 100% | 135.2 MiB/s | 138.4 KiB | 00m00s
[ 29/151] Installing bzip2-libs-0:1.0.8 100% | 83.7 MiB/s | 85.7 KiB | 00m00s
[ 30/151] Installing xz-libs-1:5.8.1-2. 100% | 213.8 MiB/s | 218.9 KiB | 00m00s
[ 31/151] Installing libuuid-0:2.40.4-7 100% | 37.5 MiB/s | 38.4 KiB | 00m00s
[ 32/151] Installing libblkid-0:2.40.4- 100% | 128.7 MiB/s | 263.5 KiB | 00m00s
[ 33/151] Installing popt-0:1.19-8.fc42 100% | 27.2 MiB/s | 139.4 KiB | 00m00s
[ 34/151] Installing readline-0:8.2-13. 100% | 237.9 MiB/s | 487.1 KiB | 00m00s
[ 35/151] Installing gmp-1:6.3.0-4.fc42 100% | 264.8 MiB/s | 813.5 KiB | 00m00s
[ 36/151] Installing libzstd-0:1.5.7-1. 100% | 263.4 MiB/s | 809.1 KiB | 00m00s
[ 37/151] Installing elfutils-libelf-0: 100% | 233.3 MiB/s | 1.2 MiB | 00m00s
[ 38/151] Installing libstdc++-0:15.2.1 100% | 257.8 MiB/s | 2.8 MiB | 00m00s
[ 39/151] Installing libxcrypt-0:4.4.38 100% | 140.2 MiB/s | 287.2 KiB | 00m00s
[ 40/151] Installing libattr-0:2.5.2-5. 100% | 27.4 MiB/s | 28.1 KiB | 00m00s
[ 41/151] Installing libacl-0:2.3.2-3.f 100% | 38.2 MiB/s | 39.2 KiB | 00m00s
[ 42/151] Installing dwz-0:0.16-1.fc42. 100% | 20.1 MiB/s | 288.5 KiB | 00m00s
[ 43/151] Installing mpfr-0:4.2.2-1.fc4 100% | 202.7 MiB/s | 830.4 KiB | 00m00s
[ 44/151] Installing gawk-0:5.3.1-1.fc4 100% | 77.0 MiB/s | 1.7 MiB | 00m00s
[ 45/151] Installing unzip-0:6.0-66.fc4 100% | 25.6 MiB/s | 393.8 KiB | 00m00s
[ 46/151] Installing file-libs-0:5.46-3 100% | 494.1 MiB/s | 11.9 MiB | 00m00s
[ 47/151] Installing file-0:5.46-3.fc42 100% | 3.5 MiB/s | 101.7 KiB | 00m00s
[ 48/151] Installing crypto-policies-0: 100% | 16.4 MiB/s | 167.8 KiB | 00m00s
[ 49/151] Installing pcre2-0:10.45-1.fc 100% | 227.6 MiB/s | 699.1 KiB | 00m00s
[ 50/151] Installing grep-0:3.11-10.fc4 100% | 45.6 MiB/s | 1.0 MiB | 00m00s
[ 51/151] Installing xz-1:5.8.1-2.fc42. 100% | 57.9 MiB/s | 1.3 MiB | 00m00s
[ 52/151] Installing libcap-ng-0:0.8.5- 100% | 73.1 MiB/s | 74.8 KiB | 00m00s
[ 53/151] Installing audit-libs-0:4.1.1 100% | 124.2 MiB/s | 381.5 KiB | 00m00s
[ 54/151] Installing libsmartcols-0:2.4 100% | 177.3 MiB/s | 181.5 KiB | 00m00s
[ 55/151] Installing lz4-libs-0:1.10.0- 100% | 154.7 MiB/s | 158.5 KiB | 00m00s
[ 56/151] Installing libsepol-0:3.8-1.f 100% | 269.2 MiB/s | 827.0 KiB | 00m00s
[ 57/151] Installing libselinux-0:3.8-3 100% | 94.9 MiB/s | 194.3 KiB | 00m00s
[ 58/151] Installing findutils-1:4.10.0 100% | 81.5 MiB/s | 1.9 MiB | 00m00s
[ 59/151] Installing sed-0:4.9-4.fc42.x 100% | 44.5 MiB/s | 865.5 KiB | 00m00s
[ 60/151] Installing libmount-0:2.40.4- 100% | 174.5 MiB/s | 357.3 KiB | 00m00s
[ 61/151] Installing libeconf-0:0.7.6-2 100% | 64.7 MiB/s | 66.2 KiB | 00m00s
[ 62/151] Installing pam-libs-0:1.7.0-6 100% | 63.0 MiB/s | 129.1 KiB | 00m00s
[ 63/151] Installing libcap-0:2.73-2.fc 100% | 13.8 MiB/s | 212.1 KiB | 00m00s
[ 64/151] Installing systemd-libs-0:257 100% | 248.0 MiB/s | 2.2 MiB | 00m00s
[ 65/151] Installing lua-libs-0:5.4.8-1 100% | 137.7 MiB/s | 282.0 KiB | 00m00s
[ 66/151] Installing libffi-0:3.4.6-5.f 100% | 81.7 MiB/s | 83.7 KiB | 00m00s
[ 67/151] Installing libtasn1-0:4.20.0- 100% | 87.0 MiB/s | 178.1 KiB
| 00m00s [ 68/151] Installing p11-kit-0:0.25.8-1 100% | 84.8 MiB/s | 2.3 MiB | 00m00s [ 69/151] Installing alternatives-0:1.3 100% | 4.8 MiB/s | 63.8 KiB | 00m00s [ 70/151] Installing libunistring-0:1.1 100% | 246.7 MiB/s | 1.7 MiB | 00m00s [ 71/151] Installing libidn2-0:2.3.8-1. 100% | 91.6 MiB/s | 562.7 KiB | 00m00s [ 72/151] Installing libpsl-0:0.21.5-5. 100% | 75.7 MiB/s | 77.5 KiB | 00m00s [ 73/151] Installing p11-kit-trust-0:0. 100% | 14.6 MiB/s | 448.3 KiB | 00m00s [ 74/151] Installing openssl-libs-1:3.2 100% | 279.3 MiB/s | 7.8 MiB | 00m00s [ 75/151] Installing coreutils-0:9.6-6. 100% | 104.8 MiB/s | 5.5 MiB | 00m00s [ 76/151] Installing ca-certificates-0: 100% | 1.2 MiB/s | 2.5 MiB | 00m02s [ 77/151] Installing gzip-0:1.13-3.fc42 100% | 22.9 MiB/s | 398.4 KiB | 00m00s [ 78/151] Installing rpm-sequoia-0:1.7. 100% | 268.3 MiB/s | 2.4 MiB | 00m00s [ 79/151] Installing libevent-0:2.1.12- 100% | 177.1 MiB/s | 906.9 KiB | 00m00s [ 80/151] Installing util-linux-core-0: 100% | 62.0 MiB/s | 1.4 MiB | 00m00s [ 81/151] Installing systemd-standalone 100% | 19.4 MiB/s | 277.8 KiB | 00m00s [ 82/151] Installing tar-2:1.35-5.fc42. 100% | 113.9 MiB/s | 3.0 MiB | 00m00s [ 83/151] Installing libsemanage-0:3.8. 100% | 99.7 MiB/s | 306.2 KiB | 00m00s [ 84/151] Installing shadow-utils-2:4.1 100% | 86.0 MiB/s | 4.0 MiB | 00m00s [ 85/151] Installing zstd-0:1.5.7-1.fc4 100% | 90.0 MiB/s | 1.7 MiB | 00m00s [ 86/151] Installing zip-0:3.0-43.fc42. 
100% | 42.9 MiB/s | 702.4 KiB | 00m00s [ 87/151] Installing libfdisk-0:2.40.4- 100% | 182.3 MiB/s | 373.4 KiB | 00m00s [ 88/151] Installing libxml2-0:2.12.10- 100% | 89.3 MiB/s | 1.7 MiB | 00m00s [ 89/151] Installing libarchive-0:3.8.1 100% | 186.9 MiB/s | 957.1 KiB | 00m00s [ 90/151] Installing bzip2-0:1.0.8-20.f 100% | 5.6 MiB/s | 103.8 KiB | 00m00s [ 91/151] Installing add-determinism-0: 100% | 117.4 MiB/s | 2.5 MiB | 00m00s [ 92/151] Installing build-reproducibil 100% | 1.0 MiB/s | 1.0 KiB | 00m00s [ 93/151] Installing sqlite-libs-0:3.47 100% | 252.1 MiB/s | 1.5 MiB | 00m00s [ 94/151] Installing rpm-libs-0:4.20.1- 100% | 176.6 MiB/s | 723.4 KiB | 00m00s [ 95/151] Installing ed-0:1.21-2.fc42.x 100% | 10.4 MiB/s | 148.8 KiB | 00m00s [ 96/151] Installing patch-0:2.8-1.fc42 100% | 15.6 MiB/s | 224.3 KiB | 00m00s [ 97/151] Installing filesystem-srpm-ma 100% | 38.0 MiB/s | 38.9 KiB | 00m00s [ 98/151] Installing elfutils-default-y 100% | 145.9 KiB/s | 2.0 KiB | 00m00s [ 99/151] Installing elfutils-libs-0:0. 100% | 167.3 MiB/s | 685.2 KiB | 00m00s [100/151] Installing cpio-0:2.15-4.fc42 100% | 52.4 MiB/s | 1.1 MiB | 00m00s [101/151] Installing diffutils-0:3.12-1 100% | 71.0 MiB/s | 1.6 MiB | 00m00s [102/151] Installing json-c-0:0.18-2.fc 100% | 43.0 MiB/s | 88.0 KiB | 00m00s [103/151] Installing libgomp-0:15.2.1-1 100% | 176.6 MiB/s | 542.5 KiB | 00m00s [104/151] Installing rpm-build-libs-0:4 100% | 101.3 MiB/s | 207.4 KiB | 00m00s [105/151] Installing jansson-0:2.14-2.f 100% | 92.2 MiB/s | 94.4 KiB | 00m00s [106/151] Installing libpkgconf-0:2.3.0 100% | 77.4 MiB/s | 79.2 KiB | 00m00s [107/151] Installing pkgconf-0:2.3.0-2. 100% | 6.3 MiB/s | 91.0 KiB | 00m00s [108/151] Installing pkgconf-pkg-config 100% | 136.4 KiB/s | 1.8 KiB | 00m00s [109/151] Installing libbrotli-0:1.1.0- 100% | 205.9 MiB/s | 843.6 KiB | 00m00s [110/151] Installing libnghttp2-0:1.64. 100% | 83.7 MiB/s | 171.5 KiB | 00m00s [111/151] Installing xxhash-libs-0:0.8. 
100% | 89.4 MiB/s | 91.6 KiB | 00m00s [112/151] Installing keyutils-libs-0:1. 100% | 58.3 MiB/s | 59.7 KiB | 00m00s [113/151] Installing libcom_err-0:1.47. 100% | 66.6 MiB/s | 68.2 KiB | 00m00s [114/151] Installing libverto-0:0.3.2-1 100% | 26.6 MiB/s | 27.2 KiB | 00m00s [115/151] Installing krb5-libs-0:1.21.3 100% | 191.0 MiB/s | 2.3 MiB | 00m00s [116/151] Installing libssh-0:0.11.3-1. 100% | 185.3 MiB/s | 569.2 KiB | 00m00s [117/151] Installing libtool-ltdl-0:2.5 100% | 69.6 MiB/s | 71.2 KiB | 00m00s [118/151] Installing gdbm-libs-1:1.23-9 100% | 64.2 MiB/s | 131.6 KiB | 00m00s [119/151] Installing cyrus-sasl-lib-0:2 100% | 104.7 MiB/s | 2.3 MiB | 00m00s [120/151] Installing openldap-0:2.6.10- 100% | 128.8 MiB/s | 659.6 KiB | 00m00s [121/151] Installing libcurl-0:8.11.1-6 100% | 203.9 MiB/s | 835.2 KiB | 00m00s [122/151] Installing elfutils-debuginfo 100% | 6.0 MiB/s | 86.2 KiB | 00m00s [123/151] Installing elfutils-0:0.193-2 100% | 121.8 MiB/s | 2.9 MiB | 00m00s [124/151] Installing binutils-0:2.44-6. 100% | 228.7 MiB/s | 25.8 MiB | 00m00s [125/151] Installing gdb-minimal-0:16.3 100% | 232.4 MiB/s | 13.2 MiB | 00m00s [126/151] Installing debugedit-0:5.1-7. 
100% | 12.7 MiB/s | 195.4 KiB | 00m00s [127/151] Installing curl-0:8.11.1-6.fc 100% | 12.0 MiB/s | 453.1 KiB | 00m00s [128/151] Installing rpm-0:4.20.1-1.fc4 100% | 58.1 MiB/s | 2.5 MiB | 00m00s [129/151] Installing lua-srpm-macros-0: 100% | 1.9 MiB/s | 1.9 KiB | 00m00s [130/151] Installing tree-sitter-srpm-m 100% | 7.2 MiB/s | 7.4 KiB | 00m00s [131/151] Installing zig-srpm-macros-0: 100% | 1.6 MiB/s | 1.7 KiB | 00m00s [132/151] Installing efi-srpm-macros-0: 100% | 40.2 MiB/s | 41.1 KiB | 00m00s [133/151] Installing perl-srpm-macros-0 100% | 0.0 B/s | 1.1 KiB | 00m00s [134/151] Installing package-notes-srpm 100% | 2.0 MiB/s | 2.0 KiB | 00m00s [135/151] Installing openblas-srpm-macr 100% | 0.0 B/s | 392.0 B | 00m00s [136/151] Installing ocaml-srpm-macros- 100% | 2.1 MiB/s | 2.2 KiB | 00m00s [137/151] Installing kernel-srpm-macros 100% | 2.3 MiB/s | 2.3 KiB | 00m00s [138/151] Installing gnat-srpm-macros-0 100% | 1.2 MiB/s | 1.3 KiB | 00m00s [139/151] Installing ghc-srpm-macros-0: 100% | 1.0 MiB/s | 1.0 KiB | 00m00s [140/151] Installing fpc-srpm-macros-0: 100% | 0.0 B/s | 420.0 B | 00m00s [141/151] Installing ansible-srpm-macro 100% | 35.4 MiB/s | 36.2 KiB | 00m00s [142/151] Installing forge-srpm-macros- 100% | 39.3 MiB/s | 40.3 KiB | 00m00s [143/151] Installing fonts-srpm-macros- 100% | 55.7 MiB/s | 57.0 KiB | 00m00s [144/151] Installing go-srpm-macros-0:3 100% | 61.6 MiB/s | 63.0 KiB | 00m00s [145/151] Installing python-srpm-macros 100% | 50.9 MiB/s | 52.2 KiB | 00m00s [146/151] Installing redhat-rpm-config- 100% | 46.9 MiB/s | 192.2 KiB | 00m00s [147/151] Installing rpm-build-0:4.20.1 100% | 10.2 MiB/s | 177.4 KiB | 00m00s [148/151] Installing pyproject-srpm-mac 100% | 1.2 MiB/s | 2.5 KiB | 00m00s [149/151] Installing util-linux-0:2.40. 100% | 58.7 MiB/s | 3.5 MiB | 00m00s [150/151] Installing which-0:2.23-2.fc4 100% | 5.6 MiB/s | 85.7 KiB | 00m00s [151/151] Installing info-0:7.2-3.fc42. 100% | 134.4 KiB/s | 358.3 KiB | 00m03s Complete! 
Finish: installing minimal buildroot with dnf5
Start: creating root cache
Finish: creating root cache
Finish: chroot init
INFO: Installed packages:
INFO: add-determinism-0.6.0-1.fc42.x86_64 alternatives-1.33-1.fc42.x86_64 ansible-srpm-macros-1-17.1.fc42.noarch audit-libs-4.1.1-1.fc42.x86_64 basesystem-11-22.fc42.noarch bash-5.2.37-1.fc42.x86_64 binutils-2.44-6.fc42.x86_64 build-reproducibility-srpm-macros-0.6.0-1.fc42.noarch bzip2-1.0.8-20.fc42.x86_64 bzip2-libs-1.0.8-20.fc42.x86_64 ca-certificates-2025.2.80_v9.0.304-1.0.fc42.noarch coreutils-9.6-6.fc42.x86_64 coreutils-common-9.6-6.fc42.x86_64 cpio-2.15-4.fc42.x86_64 crypto-policies-20250707-1.gitad370a8.fc42.noarch curl-8.11.1-6.fc42.x86_64 cyrus-sasl-lib-2.1.28-30.fc42.x86_64 debugedit-5.1-7.fc42.x86_64 diffutils-3.12-1.fc42.x86_64 dwz-0.16-1.fc42.x86_64 ed-1.21-2.fc42.x86_64 efi-srpm-macros-6-3.fc42.noarch elfutils-0.193-2.fc42.x86_64 elfutils-debuginfod-client-0.193-2.fc42.x86_64 elfutils-default-yama-scope-0.193-2.fc42.noarch elfutils-libelf-0.193-2.fc42.x86_64 elfutils-libs-0.193-2.fc42.x86_64 fedora-gpg-keys-42-1.noarch fedora-release-42-30.noarch fedora-release-common-42-30.noarch fedora-release-identity-basic-42-30.noarch fedora-repos-42-1.noarch file-5.46-3.fc42.x86_64 file-libs-5.46-3.fc42.x86_64 filesystem-3.18-47.fc42.x86_64 filesystem-srpm-macros-3.18-47.fc42.noarch findutils-4.10.0-5.fc42.x86_64 fonts-srpm-macros-2.0.5-22.fc42.noarch forge-srpm-macros-0.4.0-2.fc42.noarch fpc-srpm-macros-1.3-14.fc42.noarch gawk-5.3.1-1.fc42.x86_64 gdb-minimal-16.3-1.fc42.x86_64 gdbm-libs-1.23-9.fc42.x86_64 ghc-srpm-macros-1.9.2-2.fc42.noarch glibc-2.41-11.fc42.x86_64 glibc-common-2.41-11.fc42.x86_64 glibc-gconv-extra-2.41-11.fc42.x86_64 glibc-minimal-langpack-2.41-11.fc42.x86_64 gmp-6.3.0-4.fc42.x86_64 gnat-srpm-macros-6-7.fc42.noarch gnulib-l10n-20241231-1.fc42.noarch go-srpm-macros-3.8.0-1.fc42.noarch gpg-pubkey-105ef944-65ca83d1 grep-3.11-10.fc42.x86_64 gzip-1.13-3.fc42.x86_64 info-7.2-3.fc42.x86_64
jansson-2.14-2.fc42.x86_64 json-c-0.18-2.fc42.x86_64 kernel-srpm-macros-1.0-25.fc42.noarch keyutils-libs-1.6.3-5.fc42.x86_64 krb5-libs-1.21.3-6.fc42.x86_64 libacl-2.3.2-3.fc42.x86_64 libarchive-3.8.1-1.fc42.x86_64 libattr-2.5.2-5.fc42.x86_64 libblkid-2.40.4-7.fc42.x86_64 libbrotli-1.1.0-6.fc42.x86_64 libcap-2.73-2.fc42.x86_64 libcap-ng-0.8.5-4.fc42.x86_64 libcom_err-1.47.2-3.fc42.x86_64 libcurl-8.11.1-6.fc42.x86_64 libeconf-0.7.6-2.fc42.x86_64 libevent-2.1.12-15.fc42.x86_64 libfdisk-2.40.4-7.fc42.x86_64 libffi-3.4.6-5.fc42.x86_64 libgcc-15.2.1-1.fc42.x86_64 libgomp-15.2.1-1.fc42.x86_64 libidn2-2.3.8-1.fc42.x86_64 libmount-2.40.4-7.fc42.x86_64 libnghttp2-1.64.0-3.fc42.x86_64 libpkgconf-2.3.0-2.fc42.x86_64 libpsl-0.21.5-5.fc42.x86_64 libselinux-3.8-3.fc42.x86_64 libsemanage-3.8.1-2.fc42.x86_64 libsepol-3.8-1.fc42.x86_64 libsmartcols-2.40.4-7.fc42.x86_64 libssh-0.11.3-1.fc42.x86_64 libssh-config-0.11.3-1.fc42.noarch libstdc++-15.2.1-1.fc42.x86_64 libtasn1-4.20.0-1.fc42.x86_64 libtool-ltdl-2.5.4-4.fc42.x86_64 libunistring-1.1-9.fc42.x86_64 libuuid-2.40.4-7.fc42.x86_64 libverto-0.3.2-10.fc42.x86_64 libxcrypt-4.4.38-7.fc42.x86_64 libxml2-2.12.10-1.fc42.x86_64 libzstd-1.5.7-1.fc42.x86_64 lua-libs-5.4.8-1.fc42.x86_64 lua-srpm-macros-1-15.fc42.noarch lz4-libs-1.10.0-2.fc42.x86_64 mpfr-4.2.2-1.fc42.x86_64 ncurses-base-6.5-5.20250125.fc42.noarch ncurses-libs-6.5-5.20250125.fc42.x86_64 ocaml-srpm-macros-10-4.fc42.noarch openblas-srpm-macros-2-19.fc42.noarch openldap-2.6.10-1.fc42.x86_64 openssl-libs-3.2.4-4.fc42.x86_64 p11-kit-0.25.8-1.fc42.x86_64 p11-kit-trust-0.25.8-1.fc42.x86_64 package-notes-srpm-macros-0.5-13.fc42.noarch pam-libs-1.7.0-6.fc42.x86_64 patch-2.8-1.fc42.x86_64 pcre2-10.45-1.fc42.x86_64 pcre2-syntax-10.45-1.fc42.noarch perl-srpm-macros-1-57.fc42.noarch pkgconf-2.3.0-2.fc42.x86_64 pkgconf-m4-2.3.0-2.fc42.noarch pkgconf-pkg-config-2.3.0-2.fc42.x86_64 popt-1.19-8.fc42.x86_64 publicsuffix-list-dafsa-20250616-1.fc42.noarch pyproject-srpm-macros-1.18.4-1.fc42.noarch
python-srpm-macros-3.13-5.fc42.noarch qt5-srpm-macros-5.15.17-1.fc42.noarch qt6-srpm-macros-6.9.2-1.fc42.noarch readline-8.2-13.fc42.x86_64 redhat-rpm-config-342-4.fc42.noarch rpm-4.20.1-1.fc42.x86_64 rpm-build-4.20.1-1.fc42.x86_64 rpm-build-libs-4.20.1-1.fc42.x86_64 rpm-libs-4.20.1-1.fc42.x86_64 rpm-sequoia-1.7.0-5.fc42.x86_64 rust-srpm-macros-26.4-1.fc42.noarch sed-4.9-4.fc42.x86_64 setup-2.15.0-13.fc42.noarch shadow-utils-4.17.4-1.fc42.x86_64 sqlite-libs-3.47.2-5.fc42.x86_64 systemd-libs-257.9-2.fc42.x86_64 systemd-standalone-sysusers-257.9-2.fc42.x86_64 tar-1.35-5.fc42.x86_64 tree-sitter-srpm-macros-0.1.0-8.fc42.noarch unzip-6.0-66.fc42.x86_64 util-linux-2.40.4-7.fc42.x86_64 util-linux-core-2.40.4-7.fc42.x86_64 which-2.23-2.fc42.x86_64 xxhash-libs-0.8.3-2.fc42.x86_64 xz-5.8.1-2.fc42.x86_64 xz-libs-5.8.1-2.fc42.x86_64 zig-srpm-macros-1-4.fc42.noarch zip-3.0-43.fc42.x86_64 zlib-ng-compat-2.2.5-2.fc42.x86_64 zstd-1.5.7-1.fc42.x86_64
Start: buildsrpm
Start: rpmbuild -bs
Building target platforms: x86_64
Building for target x86_64
setting SOURCE_DATE_EPOCH=1759363200
Wrote: /builddir/build/SRPMS/ollama-ggml-cuda-0.12.3-1.fc42.src.rpm
Finish: rpmbuild -bs
INFO: chroot_scan: 1 files copied to /var/lib/copr-rpmbuild/results/chroot_scan
INFO: /var/lib/mock/fedora-42-x86_64-1759428480.475249/root/var/log/dnf5.log
INFO: chroot_scan: creating tarball /var/lib/copr-rpmbuild/results/chroot_scan.tar.gz
/bin/tar: Removing leading `/' from member names
Finish: buildsrpm
INFO: Done(/var/lib/copr-rpmbuild/workspace/workdir-cnrrwoue/ollama-ggml-cuda/ollama-ggml-cuda.spec) Config(child) 1 minutes 3 seconds
INFO: Results and/or logs in: /var/lib/copr-rpmbuild/results
INFO: Cleaning up build root ('cleanup_on_success=True')
Start: clean chroot
INFO: unmounting tmpfs.
Finish: clean chroot
INFO: Start(/var/lib/copr-rpmbuild/results/ollama-ggml-cuda-0.12.3-1.fc42.src.rpm) Config(fedora-42-x86_64)
Start(bootstrap): chroot init
INFO: mounting tmpfs at /var/lib/mock/fedora-42-x86_64-bootstrap-1759428480.475249/root.
INFO: reusing tmpfs at /var/lib/mock/fedora-42-x86_64-bootstrap-1759428480.475249/root.
INFO: calling preinit hooks
INFO: enabled root cache
INFO: enabled package manager cache
Start(bootstrap): cleaning package manager metadata
Finish(bootstrap): cleaning package manager metadata
Finish(bootstrap): chroot init
Start: chroot init
INFO: mounting tmpfs at /var/lib/mock/fedora-42-x86_64-1759428480.475249/root.
INFO: calling preinit hooks
INFO: enabled root cache
Start: unpacking root cache
Finish: unpacking root cache
INFO: enabled package manager cache
Start: cleaning package manager metadata
Finish: cleaning package manager metadata
INFO: enabled HW Info plugin
INFO: Buildroot is handled by package management downloaded with a bootstrap image: rpm-4.20.1-1.fc42.x86_64 rpm-sequoia-1.7.0-5.fc42.x86_64 dnf5-5.2.16.0-1.fc42.x86_64 dnf5-plugins-5.2.16.0-1.fc42.x86_64
Finish: chroot init
Start: build phase for ollama-ggml-cuda-0.12.3-1.fc42.src.rpm
Start: build setup for ollama-ggml-cuda-0.12.3-1.fc42.src.rpm
Building target platforms: x86_64
Building for target x86_64
setting SOURCE_DATE_EPOCH=1759363200
Wrote: /builddir/build/SRPMS/ollama-ggml-cuda-0.12.3-1.fc42.src.rpm
Updating and loading repositories:
Additional repo https_developer_downlo 100% | 12.9 KiB/s | 3.9 KiB | 00m00s
Additional repo https_developer_downlo 100% | 12.9 KiB/s | 3.9 KiB | 00m00s
Copr repository 100% | 4.9 KiB/s | 1.5 KiB | 00m00s
fedora 100% | 82.3 KiB/s | 30.9 KiB | 00m00s
updates 100% | 87.9 KiB/s | 29.8 KiB | 00m00s
Repositories loaded.
Package Arch Version Repository Size
Installing:
 cmake x86_64 3.31.6-2.fc42 fedora 34.2 MiB
 cuda-compiler-12-9 x86_64 12.9.1-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 0.0 B
 cuda-compiler-13-0 x86_64 13.0.1-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 0.0 B
 cuda-libraries-devel-12-9 x86_64 12.9.1-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 0.0 B
 cuda-libraries-devel-13-0 x86_64 13.0.1-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 0.0 B
 cuda-nvml-devel-12-9 x86_64 12.9.79-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 1.4 MiB
 cuda-nvml-devel-13-0 x86_64 13.0.87-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 1.4 MiB
 gcc-c++ x86_64 15.2.1-1.fc42 updates 41.3 MiB
 gcc14 x86_64 14.2.1-8.fc42 fedora 117.2 MiB
 gcc14-c++ x86_64 14.2.1-8.fc42 fedora 59.6 MiB
Installing dependencies:
 annobin-docs noarch 12.94-1.fc42 updates 98.9 KiB
 annobin-plugin-gcc x86_64 12.94-1.fc42 updates 993.5 KiB
 cmake-data noarch 3.31.6-2.fc42 fedora 8.5 MiB
 cmake-filesystem x86_64 3.31.6-2.fc42 fedora 0.0 B
 cmake-rpm-macros noarch 3.31.6-2.fc42 fedora 7.7 KiB
 cpp x86_64 15.2.1-1.fc42 updates 37.9 MiB
 cuda-cccl-12-9 x86_64 12.9.27-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 12.7 MiB
 cuda-cccl-13-0 x86_64 13.0.85-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 13.2 MiB
 cuda-crt-12-9 x86_64 12.9.86-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 928.8 KiB
 cuda-crt-13-0 x86_64 13.0.88-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 936.8 KiB
 cuda-cudart-12-9 x86_64 12.9.79-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 785.8 KiB
 cuda-cudart-13-0 x86_64 13.0.88-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 754.1 KiB
 cuda-cudart-devel-12-9 x86_64 12.9.79-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 8.5 MiB
 cuda-cudart-devel-13-0 x86_64 13.0.88-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 6.2 MiB
 cuda-culibos-devel-13-0 x86_64 13.0.85-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 96.4 KiB
 cuda-cuobjdump-12-9 x86_64 12.9.82-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 665.7 KiB
 cuda-cuobjdump-13-0 x86_64 13.0.85-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 750.4 KiB
 cuda-cuxxfilt-12-9 x86_64 12.9.82-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 1.0 MiB
 cuda-cuxxfilt-13-0 x86_64 13.0.85-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 1.0 MiB
 cuda-driver-devel-12-9 x86_64 12.9.79-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 131.0 KiB
 cuda-driver-devel-13-0 x86_64 13.0.88-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 135.3 KiB
 cuda-nvcc-12-9 x86_64 12.9.86-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 317.8 MiB
 cuda-nvcc-13-0 x86_64 13.0.88-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 111.0 MiB
 cuda-nvprune-12-9 x86_64 12.9.82-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 181.0 KiB
 cuda-nvprune-13-0 x86_64 13.0.85-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 181.3 KiB
 cuda-nvrtc-12-9 x86_64 12.9.86-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 216.9 MiB
 cuda-nvrtc-13-0 x86_64 13.0.88-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 217.4 MiB
 cuda-nvrtc-devel-12-9 x86_64 12.9.86-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 248.0 MiB
 cuda-nvrtc-devel-13-0 x86_64 13.0.88-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 244.5 MiB
 cuda-nvvm-12-9 x86_64 12.9.86-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 132.6 MiB
 cuda-opencl-12-9 x86_64 12.9.19-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 91.7 KiB
 cuda-opencl-13-0 x86_64 13.0.85-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 96.5 KiB
 cuda-opencl-devel-12-9 x86_64 12.9.19-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 741.1 KiB
 cuda-opencl-devel-13-0 x86_64 13.0.85-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 747.9 KiB
 cuda-profiler-api-12-9 x86_64 12.9.79-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 73.4 KiB
 cuda-profiler-api-13-0 x86_64 13.0.85-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 77.6 KiB
 cuda-sandbox-devel-12-9 x86_64 12.9.19-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 146.3 KiB
 cuda-sandbox-devel-13-0 x86_64 13.0.85-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 149.4 KiB
 cuda-toolkit-12-9-config-common noarch 12.9.79-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 0.0 B
 cuda-toolkit-12-config-common noarch 12.9.79-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 44.0 B
 cuda-toolkit-13-0-config-common noarch 13.0.88-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 0.0 B
 cuda-toolkit-13-config-common noarch 13.0.88-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 44.0 B
 cuda-toolkit-config-common noarch 13.0.88-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 41.0 B
 emacs-filesystem noarch 1:30.0-4.fc42 fedora 0.0 B
 expat x86_64 2.7.2-1.fc42 updates 298.6 KiB
 gcc x86_64 15.2.1-1.fc42 updates 111.2 MiB
 gcc-plugin-annobin x86_64 15.2.1-1.fc42 updates 57.1 KiB
 glibc-devel x86_64 2.41-11.fc42 updates 2.3 MiB
 jsoncpp x86_64 1.9.6-1.fc42 fedora 261.6 KiB
 kernel-headers x86_64 6.16.2-200.fc42 updates 6.7 MiB
 libb2 x86_64 0.98.1-13.fc42 fedora 46.1 KiB
 libcublas-12-9 x86_64 12.9.1.4-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 815.6 MiB
 libcublas-13-0 x86_64 13.0.2.14-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 567.2 MiB
 libcublas-devel-12-9 x86_64 12.9.1.4-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 1.2 GiB
 libcublas-devel-13-0 x86_64 13.0.2.14-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 961.6 MiB
 libcufft-12-9 x86_64 11.4.1.4-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 277.2 MiB
 libcufft-13-0 x86_64 12.0.0.61-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 274.3 MiB
 libcufft-devel-12-9 x86_64 11.4.1.4-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 567.3 MiB
 libcufft-devel-13-0 x86_64 12.0.0.61-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 280.5 MiB
 libcufile-12-9 x86_64 1.14.1.1-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 3.2 MiB
 libcufile-13-0 x86_64 1.15.1.6-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 3.2 MiB
 libcufile-devel-12-9 x86_64 1.14.1.1-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 27.9 MiB
 libcufile-devel-13-0 x86_64 1.15.1.6-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 27.9 MiB
 libcurand-12-9 x86_64 10.3.10.19-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 159.3 MiB
 libcurand-13-0 x86_64 10.4.0.35-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 126.6 MiB
 libcurand-devel-12-9 x86_64 10.3.10.19-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 161.3 MiB
 libcurand-devel-13-0 x86_64 10.4.0.35-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 129.0 MiB
 libcusolver-12-9 x86_64 11.7.5.82-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 470.6 MiB
 libcusolver-13-0 x86_64 12.0.4.66-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 233.8 MiB
 libcusolver-devel-12-9 x86_64 11.7.5.82-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 332.5 MiB
 libcusolver-devel-13-0 x86_64 12.0.4.66-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 180.9 MiB
 libcusparse-12-9 x86_64 12.5.10.65-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 463.0 MiB
 libcusparse-13-0 x86_64 12.6.3.3-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 155.1 MiB
 libcusparse-devel-12-9 x86_64 12.5.10.65-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 960.3 MiB
 libcusparse-devel-13-0 x86_64 12.6.3.3-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 348.7 MiB
 libmpc x86_64 1.3.1-7.fc42 fedora 164.5 KiB
 libnpp-12-9 x86_64 12.4.1.87-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 393.0 MiB
 libnpp-13-0 x86_64 13.0.1.2-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 157.3 MiB
 libnpp-devel-12-9 x86_64 12.4.1.87-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 406.2 MiB
 libnpp-devel-13-0 x86_64 13.0.1.2-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 184.5 MiB
 libnvfatbin-12-9 x86_64 12.9.82-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 2.4 MiB
 libnvfatbin-13-0 x86_64 13.0.85-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 2.4 MiB
 libnvfatbin-devel-12-9 x86_64 12.9.82-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 2.3 MiB
 libnvfatbin-devel-13-0 x86_64 13.0.85-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 2.3 MiB
 libnvjitlink-12-9 x86_64 12.9.86-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 91.6 MiB
 libnvjitlink-13-0 x86_64 13.0.88-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 94.3 MiB
 libnvjitlink-devel-12-9 x86_64 12.9.86-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 127.6 MiB
 libnvjitlink-devel-13-0 x86_64 13.0.88-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 130.0 MiB
 libnvjpeg-12-9 x86_64 12.4.0.76-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 9.0 MiB
 libnvjpeg-13-0 x86_64 13.0.1.86-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 5.7 MiB
 libnvjpeg-devel-12-9 x86_64 12.4.0.76-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 9.4 MiB
 libnvjpeg-devel-13-0 x86_64 13.0.1.86-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 6.4 MiB
 libnvptxcompiler-13-0 x86_64 13.0.88-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 85.4 MiB
 libnvvm-13-0 x86_64 13.0.88-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 133.6 MiB
 libstdc++-devel x86_64 15.2.1-1.fc42 updates 16.1 MiB
 libuv x86_64 1:1.51.0-1.fc42 updates 570.2 KiB
 libxcrypt-devel x86_64 4.4.38-7.fc42 updates 30.8 KiB
 make x86_64 1:4.4.1-10.fc42 fedora 1.8 MiB
 mpdecimal x86_64 4.0.1-1.fc42 updates 217.2 KiB
 python-pip-wheel noarch 24.3.1-5.fc42 updates 1.2 MiB
 python3 x86_64 3.13.7-1.fc42 updates 28.7 KiB
 python3-libs x86_64 3.13.7-1.fc42 updates 40.1 MiB
 rhash x86_64 1.4.5-2.fc42 fedora 351.0 KiB
 tzdata noarch 2025b-1.fc42 fedora 1.6 MiB
 vim-filesystem noarch 2:9.1.1775-1.fc42 updates 40.0 B
Transaction Summary:
 Installing: 115 packages
Total size of inbound packages is 7 GiB. Need to download 7 GiB.
After this operation, 12 GiB extra will be used (install 12 GiB, remove 0 B).
[ 1/115] cmake-0:3.31.6-2.fc42.x86_64 100% | 2.6 MiB/s | 12.2 MiB | 00m05s
[ 2/115] cuda-compiler-12-9-0:12.9.1-1 100% | 28.1 KiB/s | 7.4 KiB | 00m00s
[ 3/115] cuda-compiler-13-0-0:13.0.1-1 100% | 70.3 KiB/s | 7.5 KiB | 00m00s
[ 4/115] cuda-libraries-devel-12-9-0:1 100% | 69.3 KiB/s | 7.9 KiB | 00m00s
[ 5/115] cuda-libraries-devel-13-0-0:1 100% | 77.8 KiB/s | 7.9 KiB | 00m00s
[ 6/115] cuda-nvml-devel-12-9-0:12.9.7 100% | 736.8 KiB/s | 201.2 KiB | 00m00s
[ 7/115] cuda-nvml-devel-13-0-0:13.0.8 100% | 699.2 KiB/s | 218.9 KiB | 00m00s
[ 8/115] gcc-c++-0:15.2.1-1.fc42.x86_6 100% | 46.0 MiB/s | 15.3 MiB | 00m00s
[ 9/115] libmpc-0:1.3.1-7.fc42.x86_64 100% | 417.1 KiB/s | 70.9 KiB | 00m00s
[ 10/115] make-1:4.4.1-10.fc42.x86_64 100% | 1.3 MiB/s | 587.0 KiB | 00m00s
[ 11/115] cmake-data-0:3.31.6-2.fc42.no 100% | 2.5 MiB/s | 2.5 MiB | 00m01s
[ 12/115] cmake-filesystem-0:3.31.6-2.f 100% | 137.4 KiB/s | 17.6 KiB | 00m00s
[ 13/115] jsoncpp-0:1.9.6-1.fc42.x86_64 100% | 915.9 KiB/s | 103.5 KiB | 00m00s
[ 14/115] rhash-0:1.4.5-2.fc42.x86_64 100% | 1.1 MiB/s | 198.7 KiB | 00m00s
[ 15/115] cuda-cuobjdump-12-9-0:12.9.82 100% | 905.2 KiB/s | 277.9 KiB | 00m00s
[ 16/115] cuda-cuxxfilt-12-9-0:12.9.82- 100% | 1.2 MiB/s | 282.8 KiB | 00m00s
[ 17/115] cuda-nvcc-12-9-0:12.9.86-1.x8 100% | 26.3 MiB/s | 111.3 MiB | 00m04s
[ 18/115] cuda-nvprune-12-9-0:12.9.82-1 100% | 623.0 KiB/s | 76.0 KiB | 00m00s
[ 19/115] cuda-crt-13-0-0:13.0.88-1.x86 100% | 983.2 KiB/s | 120.9 KiB | 00m00s
[ 20/115] cuda-cuobjdump-13-0-0:13.0.85 100% | 2.3 MiB/s | 309.5 KiB | 00m00s
[ 21/115] cuda-cuxxfilt-13-0-0:13.0.85- 100% | 2.2 MiB/s | 283.6 KiB | 00m00s
[ 22/115] cuda-nvcc-13-0-0:13.0.88-1.x8 100% | 22.1 MiB/s | 35.3 MiB | 00m02s
[ 23/115] cuda-nvprune-13-0-0:13.0.85-1 100% | 554.9 KiB/s | 76.6 KiB | 00m00s
[ 24/115] libnvptxcompiler-13-0-0:13.0. 100% | 25.9 MiB/s | 21.3 MiB | 00m01s
[ 25/115] gcc14-c++-0:14.2.1-8.fc42.x86 100% | 1.0 MiB/s | 16.8 MiB | 00m17s
[ 26/115] cuda-cccl-12-9-0:12.9.27-1.x8 100% | 3.3 MiB/s | 1.7 MiB | 00m01s
[ 27/115] cuda-cudart-devel-12-9-0:12.9 100% | 8.0 MiB/s | 3.0 MiB | 00m00s
[ 28/115] cuda-driver-devel-12-9-0:12.9 100% | 116.1 KiB/s | 43.1 KiB | 00m00s
[ 29/115] libnvvm-13-0-0:13.0.88-1.x86_ 100% | 27.7 MiB/s | 58.3 MiB | 00m02s
[ 30/115] cuda-opencl-devel-12-9-0:12.9 100% | 307.0 KiB/s | 119.4 KiB | 00m00s
[ 31/115] cuda-profiler-api-12-9-0:12.9 100% | 127.3 KiB/s | 26.2 KiB | 00m00s
[ 32/115] cuda-sandbox-devel-12-9-0:12. 100% | 159.1 KiB/s | 44.2 KiB | 00m00s
[ 33/115] cuda-nvrtc-devel-12-9-0:12.9. 100% | 18.2 MiB/s | 74.2 MiB | 00m04s
[ 34/115] libcufft-devel-12-9-0:11.4.1. 100% | 18.2 MiB/s | 385.6 MiB | 00m21s
[ 35/115] libcufile-devel-12-9-0:1.14.1 100% | 6.8 MiB/s | 5.2 MiB | 00m01s
[ 36/115] libcurand-devel-12-9-0:10.3.1 100% | 17.0 MiB/s | 64.2 MiB | 00m04s
[ 37/115] libcublas-devel-12-9-0:12.9.1 100% | 19.1 MiB/s | 630.3 MiB | 00m33s
[ 38/115] libcusolver-devel-12-9-0:11.7 100% | 18.3 MiB/s | 213.1 MiB | 00m12s
[ 39/115] libnpp-devel-12-9-0:12.4.1.87 100% | 20.8 MiB/s | 268.0 MiB | 00m13s
[ 40/115] libnvfatbin-devel-12-9-0:12.9 100% | 3.5 MiB/s | 863.8 KiB | 00m00s
[ 41/115] gcc14-0:14.2.1-8.fc42.x86_64 100% | 608.8 KiB/s | 43.8 MiB | 01m14s
[ 42/115] libnvjpeg-devel-12-9-0:12.4.0 100% | 4.1 MiB/s | 4.9 MiB | 00m01s
[ 43/115] libnvjitlink-devel-12-9-0:12. 100% | 14.9 MiB/s | 36.1 MiB | 00m02s
[ 44/115] cuda-cccl-13-0-0:13.0.85-1.x8 100% | 3.3 MiB/s | 1.7 MiB | 00m01s
[ 45/115] cuda-culibos-devel-13-0-0:13. 100% | 204.4 KiB/s | 32.5 KiB | 00m00s
[ 46/115] cuda-cudart-devel-13-0-0:13.0 100% | 3.7 MiB/s | 1.9 MiB | 00m01s
[ 47/115] cuda-driver-devel-13-0-0:13.0 100% | 136.6 KiB/s | 44.3 KiB | 00m00s
[ 48/115] cuda-opencl-devel-13-0-0:13.0 100% | 202.0 KiB/s | 120.8 KiB | 00m01s
[ 49/115] cuda-profiler-api-13-0-0:13.0 100% | 52.7 KiB/s | 27.1 KiB | 00m01s
[ 50/115] cuda-sandbox-devel-13-0-0:13. 100% | 256.1 KiB/s | 45.3 KiB | 00m00s
[ 51/115] cuda-nvrtc-devel-13-0-0:13.0. 100% | 16.4 MiB/s | 73.7 MiB | 00m04s
[ 52/115] libcufft-devel-13-0-0:12.0.0. 100% | 13.7 MiB/s | 205.4 MiB | 00m15s
[ 53/115] libcufile-devel-13-0-0:1.15.1 100% | 9.6 MiB/s | 5.2 MiB | 00m01s
[ 54/115] libcusparse-devel-12-9-0:12.5 100% | 15.3 MiB/s | 710.9 MiB | 00m46s
[ 55/115] libcurand-devel-13-0-0:10.4.0 100% | 12.6 MiB/s | 56.0 MiB | 00m04s
[ 56/115] libcusolver-devel-13-0-0:12.0 100% | 13.4 MiB/s | 124.4 MiB | 00m09s
[ 57/115] libcublas-devel-13-0-0:13.0.2 100% | 13.5 MiB/s | 470.7 MiB | 00m35s
[ 58/115] libnvfatbin-devel-13-0-0:13.0 100% | 846.1 KiB/s | 877.4 KiB | 00m01s
[ 59/115] libnvjitlink-devel-13-0-0:13. 100% | 8.0 MiB/s | 36.7 MiB | 00m05s
[ 60/115] libnpp-devel-13-0-0:13.0.1.2- 100% | 11.6 MiB/s | 125.6 MiB | 00m11s
[ 61/115] libnvjpeg-devel-13-0-0:13.0.1 100% | 3.5 MiB/s | 3.4 MiB | 00m01s
[ 62/115] emacs-filesystem-1:30.0-4.fc4 100% | 26.1 KiB/s | 7.4 KiB | 00m00s
[ 63/115] gcc-0:15.2.1-1.fc42.x86_64 100% | 59.0 MiB/s | 39.4 MiB | 00m01s
[ 64/115] cuda-crt-12-9-0:12.9.86-1.x86 100% | 295.5 KiB/s | 119.7 KiB | 00m00s
[ 65/115] cuda-cudart-12-9-0:12.9.79-1. 100% | 539.4 KiB/s | 236.8 KiB | 00m00s
[ 66/115] libcusparse-devel-13-0-0:12.6 100% | 14.0 MiB/s | 286.7 MiB | 00m20s
[ 67/115] cuda-opencl-12-9-0:12.9.19-1. 100% | 342.4 KiB/s | 34.2 KiB | 00m00s
[ 68/115] cuda-nvvm-12-9-0:12.9.86-1.x8 100% | 13.5 MiB/s | 57.6 MiB | 00m04s
[ 69/115] cuda-nvrtc-12-9-0:12.9.86-1.x 100% | 13.4 MiB/s | 84.8 MiB | 00m06s
[ 70/115] libcufile-12-9-0:1.14.1.1-1.x 100% | 962.1 KiB/s | 1.2 MiB | 00m01s
[ 71/115] libcurand-12-9-0:10.3.10.19-1 100% | 12.4 MiB/s | 63.9 MiB | 00m05s
[ 72/115] libcufft-12-9-0:11.4.1.4-1.x8 100% | 12.8 MiB/s | 191.7 MiB | 00m15s
[ 73/115] libcusolver-12-9-0:11.7.5.82- 100% | 11.9 MiB/s | 324.9 MiB | 00m27s
[ 74/115] libcublas-12-9-0:12.9.1.4-1.x 100% | 12.7 MiB/s | 555.4 MiB | 00m44s
[ 75/115] libnvfatbin-12-9-0:12.9.82-1. 100% | 1.5 MiB/s | 940.1 KiB | 00m01s
[ 76/115] libcusparse-12-9-0:12.5.10.65 100% | 12.8 MiB/s | 351.7 MiB | 00m27s
[ 77/115] libnvjpeg-12-9-0:12.4.0.76-1. 100% | 5.6 MiB/s | 5.1 MiB | 00m01s
[ 78/115] cuda-cudart-13-0-0:13.0.88-1. 100% | 402.0 KiB/s | 223.1 KiB | 00m01s
[ 79/115] libnvjitlink-12-9-0:12.9.86-1 100% | 14.6 MiB/s | 37.6 MiB | 00m03s
[ 80/115] cuda-opencl-13-0-0:13.0.85-1. 100% | 560.1 KiB/s | 35.3 KiB | 00m00s
[ 81/115] cuda-nvrtc-13-0-0:13.0.88-1.x 100% | 11.5 MiB/s | 85.4 MiB | 00m07s
[ 82/115] libnpp-12-9-0:12.4.1.87-1.x86 100% | 13.3 MiB/s | 271.1 MiB | 00m20s
[ 83/115] libcufile-13-0-0:1.15.1.6-1.x 100% | 1.2 MiB/s | 1.2 MiB | 00m01s
[ 84/115] libcurand-13-0-0:10.4.0.35-1. 100% | 7.7 MiB/s | 55.7 MiB | 00m07s
[ 85/115] libcufft-13-0-0:12.0.0.61-1.x 100% | 10.3 MiB/s | 204.4 MiB | 00m20s
[ 86/115] libcusolver-13-0-0:12.0.4.66- 100% | 10.5 MiB/s | 191.4 MiB | 00m18s
[ 87/115] libcusparse-13-0-0:12.6.3.3-1 100% | 9.6 MiB/s | 139.2 MiB | 00m15s
[ 88/115] libcublas-13-0-0:13.0.2.14-1. 100% | 9.6 MiB/s | 401.1 MiB | 00m42s
[ 89/115] libnvfatbin-13-0-0:13.0.85-1. 100% | 1.5 MiB/s | 950.0 KiB | 00m01s
[ 90/115] libnvjpeg-13-0-0:13.0.1.86-1. 100% | 2.9 MiB/s | 3.5 MiB | 00m01s
[ 91/115] cpp-0:15.2.1-1.fc42.x86_64 100% | 21.7 MiB/s | 12.9 MiB | 00m01s
[ 92/115] libstdc++-devel-0:15.2.1-1.fc 100% | 27.6 MiB/s | 2.9 MiB | 00m00s
[ 93/115] glibc-devel-0:2.41-11.fc42.x8 100% | 23.4 MiB/s | 623.2 KiB | 00m00s
[ 94/115] vim-filesystem-2:9.1.1775-1.f 100% | 1.1 MiB/s | 15.4 KiB | 00m00s
[ 95/115] expat-0:2.7.2-1.fc42.x86_64 100% | 7.3 MiB/s | 119.0 KiB | 00m00s
[ 96/115] libuv-1:1.51.0-1.fc42.x86_64 100% | 13.0 MiB/s | 266.3 KiB | 00m00s
[ 97/115] cuda-toolkit-config-common-0: 100% | 29.5 KiB/s | 8.0 KiB | 00m00s
[ 98/115] cuda-toolkit-13-0-config-comm 100% | 20.1 KiB/s | 7.8 KiB | 00m00s
[ 99/115] cuda-toolkit-13-config-common 100% | 17.6 KiB/s | 8.0 KiB | 00m00s
[100/115] cuda-toolkit-12-9-config-comm 100% | 28.3 KiB/s | 7.8 KiB | 00m00s
[101/115] libnvjitlink-13-0-0:13.0.88-1 100% | 10.7 MiB/s | 38.5 MiB | 00m04s
[102/115] kernel-headers-0:6.16.2-200.f 100% | 26.7 MiB/s | 1.7 MiB | 00m00s
[103/115] libxcrypt-devel-0:4.4.38-7.fc 100% | 2.0 MiB/s | 29.4 KiB | 00m00s
[104/115] gcc-plugin-annobin-0:15.2.1-1 100% | 3.4 MiB/s | 55.8 KiB | 00m00s
[105/115] cuda-toolkit-12-config-common 100% | 36.9 KiB/s | 8.0 KiB | 00m00s
[106/115] annobin-plugin-gcc-0:12.94-1. 100% | 30.0 MiB/s | 981.9 KiB | 00m00s
[107/115] annobin-docs-0:12.94-1.fc42.n 100% | 1.7 MiB/s | 90.4 KiB | 00m00s
[108/115] python3-0:3.13.7-1.fc42.x86_6 100% | 2.1 MiB/s | 30.6 KiB | 00m00s
[109/115] cmake-rpm-macros-0:3.31.6-2.f 100% | 67.6 KiB/s | 16.9 KiB | 00m00s
[110/115] python3-libs-0:3.13.7-1.fc42.
100% | 52.9 MiB/s | 9.2 MiB | 00m00s [111/115] libb2-0:0.98.1-13.fc42.x86_64 100% | 224.6 KiB/s | 25.4 KiB | 00m00s [112/115] mpdecimal-0:4.0.1-1.fc42.x86_ 100% | 6.3 MiB/s | 97.1 KiB | 00m00s [113/115] python-pip-wheel-0:24.3.1-5.f 100% | 26.2 MiB/s | 1.2 MiB | 00m00s [114/115] libnpp-13-0-0:13.0.1.2-1.x86_ 100% | 17.1 MiB/s | 127.8 MiB | 00m07s [115/115] tzdata-0:2025b-1.fc42.noarch 100% | 1.3 MiB/s | 714.0 KiB | 00m01s -------------------------------------------------------------------------------- [115/115] Total 100% | 34.4 MiB/s | 7.2 GiB | 03m34s Running transaction [ 1/117] Verify package files 100% | 3.0 B/s | 115.0 B | 00m35s [ 2/117] Prepare transaction 100% | 858.0 B/s | 115.0 B | 00m00s [ 3/117] Installing cuda-toolkit-confi 100% | 0.0 B/s | 312.0 B | 00m00s [ 4/117] Installing cuda-toolkit-12-co 100% | 0.0 B/s | 316.0 B | 00m00s [ 5/117] Installing cuda-toolkit-12-9- 100% | 0.0 B/s | 124.0 B | 00m00s [ 6/117] Installing cuda-toolkit-13-co 100% | 0.0 B/s | 316.0 B | 00m00s [ 7/117] Installing cuda-toolkit-13-0- 100% | 0.0 B/s | 124.0 B | 00m00s [ 8/117] Installing cuda-culibos-devel 100% | 94.7 MiB/s | 97.0 KiB | 00m00s [ 9/117] Installing libmpc-0:1.3.1-7.f 100% | 81.1 MiB/s | 166.1 KiB | 00m00s [ 10/117] Installing make-1:4.4.1-10.fc 100% | 72.0 MiB/s | 1.8 MiB | 00m00s [ 11/117] Installing expat-0:2.7.2-1.fc 100% | 15.5 MiB/s | 300.7 KiB | 00m00s [ 12/117] Installing libstdc++-devel-0: 100% | 188.6 MiB/s | 16.2 MiB | 00m00s [ 13/117] Installing cuda-cccl-13-0-0:1 100% | 99.2 MiB/s | 13.6 MiB | 00m00s [ 14/117] Installing cuda-cccl-12-9-0:1 100% | 104.5 MiB/s | 13.1 MiB | 00m00s [ 15/117] Installing libnvvm-13-0-0:13. 
100% | 197.6 MiB/s | 133.6 MiB | 00m01s [ 16/117] Installing libnvptxcompiler-1 100% | 274.7 MiB/s | 85.4 MiB | 00m00s [ 17/117] Installing cuda-crt-13-0-0:13 100% | 184.0 MiB/s | 942.2 KiB | 00m00s [ 18/117] Installing cmake-filesystem-0 100% | 2.5 MiB/s | 7.6 KiB | 00m00s [ 19/117] Installing cpp-0:15.2.1-1.fc4 100% | 269.1 MiB/s | 37.9 MiB | 00m00s [ 20/117] Installing cuda-sandbox-devel 100% | 74.1 MiB/s | 151.7 KiB | 00m00s [ 21/117] Installing cuda-cudart-13-0-0 100% | 36.9 MiB/s | 755.6 KiB | 00m00s [ 22/117] Installing cuda-cudart-devel- 100% | 231.9 MiB/s | 6.3 MiB | 00m00s [ 23/117] Installing cuda-opencl-13-0-0 100% | 8.0 MiB/s | 98.1 KiB | 00m00s [ 24/117] Installing cuda-opencl-devel- 100% | 183.4 MiB/s | 751.3 KiB | 00m00s [ 25/117] Installing libcublas-13-0-0:1 100% | 254.8 MiB/s | 567.2 MiB | 00m02s [ 26/117] Installing libcublas-devel-13 100% | 276.2 MiB/s | 961.6 MiB | 00m03s [ 27/117] Installing libcufft-13-0-0:12 100% | 167.4 MiB/s | 274.3 MiB | 00m02s [ 28/117] Installing libcufft-devel-13- 100% | 172.7 MiB/s | 280.5 MiB | 00m02s [ 29/117] Installing libcufile-13-0-0:1 100% | 100.4 MiB/s | 3.2 MiB | 00m00s [ 30/117] Installing libcufile-devel-13 100% | 297.0 MiB/s | 27.9 MiB | 00m00s [ 31/117] Installing libcurand-13-0-0:1 100% | 277.1 MiB/s | 126.6 MiB | 00m00s [ 32/117] Installing libcurand-devel-13 100% | 285.9 MiB/s | 129.0 MiB | 00m00s [ 33/117] Installing libcusolver-13-0-0 100% | 274.8 MiB/s | 233.8 MiB | 00m01s [ 34/117] Installing libcusolver-devel- 100% | 281.4 MiB/s | 180.9 MiB | 00m01s [ 35/117] Installing libcusparse-13-0-0 100% | 274.5 MiB/s | 155.1 MiB | 00m01s [ 36/117] Installing libcusparse-devel- 100% | 397.6 MiB/s | 348.7 MiB | 00m01s [ 37/117] Installing libnpp-13-0-0:13.0 100% | 254.6 MiB/s | 157.4 MiB | 00m01s [ 38/117] Installing libnpp-devel-13-0- 100% | 285.6 MiB/s | 184.5 MiB | 00m01s [ 39/117] Installing libnvfatbin-13-0-0 100% | 83.4 MiB/s | 2.4 MiB | 00m00s [ 40/117] Installing libnvfatbin-devel- 100% | 180.2 MiB/s 
| 2.3 MiB | 00m00s [ 41/117] Installing libnvjitlink-13-0- 100% | 195.2 MiB/s | 94.3 MiB | 00m00s [ 42/117] Installing libnvjitlink-devel 100% | 234.6 MiB/s | 130.0 MiB | 00m01s [ 43/117] Installing libnvjpeg-13-0-0:1 100% | 138.2 MiB/s | 5.7 MiB | 00m00s [ 44/117] Installing libnvjpeg-devel-13 100% | 238.0 MiB/s | 6.4 MiB | 00m00s [ 45/117] Installing cuda-sandbox-devel 100% | 145.1 MiB/s | 148.6 KiB | 00m00s [ 46/117] Installing cuda-cudart-12-9-0 100% | 40.5 MiB/s | 787.3 KiB | 00m00s [ 47/117] Installing cuda-cudart-devel- 100% | 184.4 MiB/s | 8.5 MiB | 00m00s [ 48/117] Installing cuda-opencl-12-9-0 100% | 6.1 MiB/s | 93.4 KiB | 00m00s [ 49/117] Installing cuda-opencl-devel- 100% | 181.8 MiB/s | 744.4 KiB | 00m00s [ 50/117] Installing libcublas-12-9-0:1 100% | 202.9 MiB/s | 815.6 MiB | 00m04s [ 51/117] Installing libcublas-devel-12 100% | 224.3 MiB/s | 1.2 GiB | 00m05s [ 52/117] Installing libcufft-12-9-0:11 100% | 170.4 MiB/s | 277.2 MiB | 00m02s [ 53/117] Installing libcufft-devel-12- 100% | 169.6 MiB/s | 567.3 MiB | 00m03s [ 54/117] Installing libcufile-12-9-0:1 100% | 101.2 MiB/s | 3.2 MiB | 00m00s [ 55/117] Installing libcufile-devel-12 100% | 310.1 MiB/s | 27.9 MiB | 00m00s [ 56/117] Installing libcurand-12-9-0:1 100% | 267.3 MiB/s | 159.3 MiB | 00m01s [ 57/117] Installing libcurand-devel-12 100% | 226.2 MiB/s | 161.3 MiB | 00m01s [ 58/117] Installing libcusolver-12-9-0 100% | 146.1 MiB/s | 470.6 MiB | 00m03s [ 59/117] Installing libcusolver-devel- 100% | 111.2 MiB/s | 332.5 MiB | 00m03s [ 60/117] Installing libcusparse-12-9-0 100% | 143.0 MiB/s | 463.0 MiB | 00m03s [ 61/117] Installing libcusparse-devel- 100% | 161.9 MiB/s | 960.3 MiB | 00m06s [ 62/117] Installing libnpp-12-9-0:12.4 100% | 147.1 MiB/s | 393.0 MiB | 00m03s [ 63/117] Installing libnpp-devel-12-9- 100% | 154.5 MiB/s | 406.2 MiB | 00m03s [ 64/117] Installing libnvfatbin-12-9-0 100% | 85.6 MiB/s | 2.4 MiB | 00m00s [ 65/117] Installing libnvfatbin-devel- 100% | 192.3 MiB/s | 2.3 MiB | 00m00s [ 
66/117] Installing libnvjitlink-12-9- 100% | 207.6 MiB/s | 91.6 MiB | 00m00s [ 67/117] Installing libnvjitlink-devel 100% | 244.4 MiB/s | 127.6 MiB | 00m01s [ 68/117] Installing libnvjpeg-12-9-0:1 100% | 136.1 MiB/s | 9.0 MiB | 00m00s [ 69/117] Installing libnvjpeg-devel-12 100% | 167.7 MiB/s | 9.4 MiB | 00m00s [ 70/117] Installing python-pip-wheel-0 100% | 103.7 MiB/s | 1.2 MiB | 00m00s [ 71/117] Installing mpdecimal-0:4.0.1- 100% | 19.4 MiB/s | 218.8 KiB | 00m00s [ 72/117] Installing tzdata-0:2025b-1.f 100% | 21.5 MiB/s | 1.9 MiB | 00m00s [ 73/117] Installing libb2-0:0.98.1-13. 100% | 5.1 MiB/s | 47.2 KiB | 00m00s [ 74/117] Installing python3-libs-0:3.1 100% | 161.7 MiB/s | 40.4 MiB | 00m00s [ 75/117] Installing python3-0:3.13.7-1 100% | 1.4 MiB/s | 30.5 KiB | 00m00s [ 76/117] Installing cmake-rpm-macros-0 100% | 4.1 MiB/s | 8.3 KiB | 00m00s [ 77/117] Installing annobin-docs-0:12. 100% | 24.4 MiB/s | 100.0 KiB | 00m00s [ 78/117] Installing kernel-headers-0:6 100% | 106.7 MiB/s | 6.8 MiB | 00m00s [ 79/117] Installing libxcrypt-devel-0: 100% | 8.1 MiB/s | 33.1 KiB | 00m00s [ 80/117] Installing glibc-devel-0:2.41 100% | 75.2 MiB/s | 2.3 MiB | 00m00s [ 81/117] Installing gcc-0:15.2.1-1.fc4 100% | 282.5 MiB/s | 111.3 MiB | 00m00s [ 82/117] Installing gcc-c++-0:15.2.1-1 100% | 261.7 MiB/s | 41.4 MiB | 00m00s [ 83/117] Installing cuda-nvcc-13-0-0:1 100% | 180.2 MiB/s | 111.0 MiB | 00m01s [ 84/117] Installing gcc14-0:14.2.1-8.f 100% | 288.8 MiB/s | 117.2 MiB | 00m00s [ 85/117] Installing libuv-1:1.51.0-1.f 100% | 139.9 MiB/s | 573.0 KiB | 00m00s [ 86/117] Installing vim-filesystem-2:9 100% | 2.3 MiB/s | 4.7 KiB | 00m00s [ 87/117] Installing cuda-nvrtc-13-0-0: 100% | 212.5 MiB/s | 217.4 MiB | 00m01s [ 88/117] Installing cuda-nvrtc-devel-1 100% | 239.7 MiB/s | 244.5 MiB | 00m01s [ 89/117] Installing cuda-nvrtc-12-9-0: 100% | 209.7 MiB/s | 216.9 MiB | 00m01s [ 90/117] Installing cuda-nvrtc-devel-1 100% | 168.6 MiB/s | 248.0 MiB | 00m01s [ 91/117] Installing 
cuda-nvvm-12-9-0:1 100% | 168.6 MiB/s | 132.7 MiB | 00m01s [ 92/117] Installing cuda-crt-12-9-0:12 100% | 130.3 MiB/s | 933.9 KiB | 00m00s [ 93/117] Installing cuda-nvcc-12-9-0:1 100% | 180.0 MiB/s | 317.8 MiB | 00m02s [ 94/117] Installing emacs-filesystem-1 100% | 177.1 KiB/s | 544.0 B | 00m00s [ 95/117] Installing cuda-profiler-api- 100% | 38.6 MiB/s | 79.1 KiB | 00m00s [ 96/117] Installing cuda-driver-devel- 100% | 44.6 MiB/s | 137.0 KiB | 00m00s [ 97/117] Installing cuda-profiler-api- 100% | 24.4 MiB/s | 74.9 KiB | 00m00s [ 98/117] Installing cuda-driver-devel- 100% | 64.8 MiB/s | 132.8 KiB | 00m00s [ 99/117] Installing cuda-nvprune-13-0- 100% | 59.3 MiB/s | 182.1 KiB | 00m00s [100/117] Installing cuda-cuxxfilt-13-0 100% | 131.1 MiB/s | 1.0 MiB | 00m00s [101/117] Installing cuda-cuobjdump-13- 100% | 104.8 MiB/s | 751.3 KiB | 00m00s [102/117] Installing cuda-nvprune-12-9- 100% | 88.8 MiB/s | 181.8 KiB | 00m00s [103/117] Installing cuda-cuxxfilt-12-9 100% | 116.1 MiB/s | 1.0 MiB | 00m00s [104/117] Installing cuda-cuobjdump-12- 100% | 81.4 MiB/s | 666.6 KiB | 00m00s [105/117] Installing rhash-0:1.4.5-2.fc 100% | 13.9 MiB/s | 356.4 KiB | 00m00s [106/117] Installing jsoncpp-0:1.9.6-1. 100% | 21.4 MiB/s | 263.1 KiB | 00m00s [107/117] Installing cmake-data-0:3.31. 
100% | 48.5 MiB/s | 9.1 MiB | 00m00s [108/117] Installing cmake-0:3.31.6-2.f 100% | 263.3 MiB/s | 34.2 MiB | 00m00s [109/117] Installing cuda-compiler-12-9 100% | 0.0 B/s | 124.0 B | 00m00s [110/117] Installing cuda-compiler-13-0 100% | 0.0 B/s | 124.0 B | 00m00s [111/117] Installing cuda-libraries-dev 100% | 0.0 B/s | 124.0 B | 00m00s [112/117] Installing cuda-libraries-dev 100% | 60.5 KiB/s | 124.0 B | 00m00s [113/117] Installing gcc14-c++-0:14.2.1 100% | 252.3 MiB/s | 59.8 MiB | 00m00s [114/117] Installing gcc-plugin-annobin 100% | 2.1 MiB/s | 58.6 KiB | 00m00s [115/117] Installing annobin-plugin-gcc 100% | 31.3 MiB/s | 995.1 KiB | 00m00s [116/117] Installing cuda-nvml-devel-13 100% | 177.6 MiB/s | 1.4 MiB | 00m00s [117/117] Installing cuda-nvml-devel-12 100% | 5.1 MiB/s | 1.4 MiB | 00m00s Warning: skipped OpenPGP checks for 85 packages from repositories: https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64, https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 Complete! Finish: build setup for ollama-ggml-cuda-0.12.3-1.fc42.src.rpm Start: rpmbuild ollama-ggml-cuda-0.12.3-1.fc42.src.rpm Building target platforms: x86_64 Building for target x86_64 setting SOURCE_DATE_EPOCH=1759363200 Executing(%mkbuilddir): /bin/sh -e /var/tmp/rpm-tmp.M94KKc Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.9bqWEW + umask 022 + cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build + cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build + rm -rf ollama-0.12.3 + /usr/lib/rpm/rpmuncompress -x /builddir/build/SOURCES/v0.12.3.tar.gz + STATUS=0 + '[' 0 -ne 0 ']' + cd ollama-0.12.3 + /usr/bin/chmod -Rf a+rX,u+w,g-w,o-w . 
+ /usr/lib/rpm/rpmuncompress /builddir/build/SOURCES/remove-runtime-for-cuda-and-rocm.patch
+ /usr/bin/patch -p1 -s --fuzz=0 --no-backup-if-mismatch -f
+ /usr/lib/rpm/rpmuncompress /builddir/build/SOURCES/replace-library-paths.patch
+ /usr/bin/patch -p1 -s --fuzz=0 --no-backup-if-mismatch -f
+ cp -a /usr/local/cuda-12/ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/
+ patch -p1 -d /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/targets/x86_64-linux/
patching file include/crt/math_functions.h
Hunk #1 succeeded at 2553 with fuzz 1.
Hunk #2 succeeded at 2576 with fuzz 1.
Hunk #3 succeeded at 2598 with fuzz 1.
patch unexpectedly ends in middle of line
Hunk #4 succeeded at 2620 with fuzz 1.
+ RPM_EC=0
++ jobs -p
+ exit 0
Executing(%build): /bin/sh -e /var/tmp/rpm-tmp.ey80Lu
+ umask 022
+ cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ CFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer '
+ export CFLAGS
+ CXXFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer '
+ export CXXFLAGS
+ FFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules '
+ export FFLAGS
+ FCFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules '
+ export FCFLAGS
+ VALAFLAGS=-g
+ export VALAFLAGS
+ RUSTFLAGS='-Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Cforce-frame-pointers=yes -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn'
+ export RUSTFLAGS
+ LDFLAGS='-Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes '
+ export LDFLAGS
+ LT_SYS_LIBRARY_PATH=/usr/lib64:
+ export LT_SYS_LIBRARY_PATH
+ CC=gcc
+ export CC
+ CXX=g++
+ export CXX
+ cd ollama-0.12.3
+ CFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer '
+ export CFLAGS
+ CXXFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer '
+ export CXXFLAGS
+ FFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules '
+ export FFLAGS
+ FCFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules '
+ export FCFLAGS
+ VALAFLAGS=-g
+ export VALAFLAGS
+ RUSTFLAGS='-Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Cforce-frame-pointers=yes -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn'
+ export RUSTFLAGS
+ LDFLAGS='-Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes '
+ export LDFLAGS
+ LT_SYS_LIBRARY_PATH=/usr/lib64:
+ export LT_SYS_LIBRARY_PATH
+ CC=gcc
+ export CC
+ CXX=g++
+ export CXX
+ /usr/bin/cmake -S . -B redhat-linux-build_cuda-13 -DCMAKE_C_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_CXX_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_Fortran_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON -DCMAKE_INSTALL_DO_STRIP:BOOL=OFF -DCMAKE_INSTALL_PREFIX:PATH=/usr -DCMAKE_INSTALL_FULL_SBINDIR:PATH=/usr/bin -DCMAKE_INSTALL_SBINDIR:PATH=bin -DINCLUDE_INSTALL_DIR:PATH=/usr/include -DLIB_INSTALL_DIR:PATH=/usr/lib64 -DSYSCONF_INSTALL_DIR:PATH=/etc -DSHARE_INSTALL_PREFIX:PATH=/usr/share -DLIB_SUFFIX=64 -DBUILD_SHARED_LIBS:BOOL=ON --preset 'CUDA 13' -DOLLAMA_RUNNER_DIR=cuda_v13 -DCMAKE_CUDA_COMPILER=/usr/local/cuda-13/bin/nvcc -DCMAKE_CUDA_FLAGS_RELEASE=-DNDEBUG '-DCMAKE_CUDA_FLAGS=-O2 -g -Xcompiler "-fPIC"'
Preset CMake variables:
  CMAKE_BUILD_TYPE="Release"
  CMAKE_CUDA_ARCHITECTURES="75-virtual;80-virtual;86-virtual;87-virtual;89-virtual;90-virtual;90a-virtual;100-virtual;110-virtual;120-virtual;121-virtual"
  CMAKE_MSVC_RUNTIME_LIBRARY="MultiThreaded"
-- The C compiler identification is GNU 15.2.1
-- The CXX compiler identification is GNU 15.2.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Warning: ccache not found - consider installing it for faster compilation or disable this warning with GGML_CCACHE=OFF
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- GGML_SYSTEM_ARCH: x86
-- Including CPU backend
-- x86 detected
-- Adding CPU backend variant ggml-cpu-x64:
-- x86 detected
-- Adding CPU backend variant ggml-cpu-sse42: -msse4.2 GGML_SSE42
-- x86 detected
-- Adding CPU backend variant ggml-cpu-sandybridge: -msse4.2;-mavx GGML_SSE42;GGML_AVX
-- x86 detected
-- Adding CPU backend variant ggml-cpu-haswell: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2 GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2
-- x86 detected
-- Adding CPU backend variant ggml-cpu-skylakex: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2;-mavx512f;-mavx512cd;-mavx512vl;-mavx512dq;-mavx512bw GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2;GGML_AVX512
-- x86 detected
-- Adding CPU backend variant ggml-cpu-icelake: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2;-mavx512f;-mavx512cd;-mavx512vl;-mavx512dq;-mavx512bw;-mavx512vbmi;-mavx512vnni GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2;GGML_AVX512;GGML_AVX512_VBMI;GGML_AVX512_VNNI
-- x86 detected
-- Adding CPU backend variant ggml-cpu-alderlake: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2;-mavxvnni GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2;GGML_AVX_VNNI
-- Found CUDAToolkit: /usr/local/cuda-13/targets/x86_64-linux/include (found version "13.0.88")
-- CUDA Toolkit found
-- Using CUDA architectures: 75-virtual;80-virtual;86-virtual;87-virtual;89-virtual;90-virtual;90a-virtual;100-virtual;110-virtual;120-virtual;121-virtual
-- The CUDA compiler identification is NVIDIA 13.0.88 with host compiler GNU 15.2.1
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/local/cuda-13/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Looking for a HIP compiler
-- Looking for a HIP compiler - NOTFOUND
-- Configuring done (8.7s)
-- Generating done (0.1s)
CMake Warning:
  Manually-specified variables were not used by the project:

    CMAKE_Fortran_FLAGS_RELEASE
    CMAKE_INSTALL_DO_STRIP
    INCLUDE_INSTALL_DIR
    LIB_SUFFIX
    SHARE_INSTALL_PREFIX
    SYSCONF_INSTALL_DIR

-- Build files have been written to: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13
+ /usr/bin/cmake --build redhat-linux-build_cuda-13 -j2 --verbose --target ggml-cuda
Change Dir: '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
Run Build Command(s): /usr/bin/cmake -E env VERBOSE=1 /usr/bin/gmake -f Makefile -j2 ggml-cuda
/usr/bin/cmake -S/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 -B/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13 --check-build-system CMakeFiles/Makefile.cmake 0
/usr/bin/gmake -f CMakeFiles/Makefile2 ggml-cuda
gmake[1]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
/usr/bin/cmake -S/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 -B/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13 --check-build-system CMakeFiles/Makefile.cmake 0
/usr/bin/cmake -E cmake_progress_start /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/CMakeFiles 47
/usr/bin/gmake -f CMakeFiles/Makefile2 ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/all
gmake[2]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
/usr/bin/gmake -f ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/build.make ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/depend
gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13 && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/DependInfo.cmake "--color="
gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
/usr/bin/gmake -f ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/build.make ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/build
gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
[  0%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.cpp.o
[  2%] Building C object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.c.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/gcc -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.c.o -MF CMakeFiles/ggml-base.dir/ggml.c.o.d -o CMakeFiles/ggml-base.dir/ggml.c.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.cpp.o -MF CMakeFiles/ggml-base.dir/ggml.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.cpp
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c:5663:13: warning: ‘ggml_hash_map_free’ defined but not used [-Wunused-function]
 5663 | static void ggml_hash_map_free(struct hash_map * map) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c:5656:26: warning: ‘ggml_new_hash_map’ defined but not used [-Wunused-function]
 5656 | static struct hash_map * ggml_new_hash_map(size_t size) {
      |                          ^~~~~~~~~~~~~~~~~
In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c:5:
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘ggml_hash_find_or_insert’ defined but not used [-Wunused-function]
  282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |               ^~~~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘ggml_hash_contains’ defined but not used [-Wunused-function]
  256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘ggml_get_op_params_f32’ defined but not used [-Wunused-function]
  140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) {
      |              ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘ggml_are_same_layout’ defined but not used [-Wunused-function]
   77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) {
      |             ^~~~~~~~~~~~~~~~~~~~
In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.cpp:1:
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘size_t ggml_hash_find_or_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function]
  282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |               ^~~~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘size_t ggml_hash_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function]
  261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |               ^~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘bool ggml_hash_contains(const ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function]
  256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘size_t ggml_bitset_size(size_t)’ defined but not used [-Wunused-function]
  187 | static size_t ggml_bitset_size(size_t n) {
      |               ^~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘void ggml_set_op_params_f32(ggml_tensor*, uint32_t, float)’ defined but not used [-Wunused-function]
  150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘void ggml_set_op_params_i32(ggml_tensor*, uint32_t, int32_t)’ defined but not used [-Wunused-function]
  145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘float ggml_get_op_params_f32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function]
  140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) {
      |              ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘int32_t ggml_get_op_params_i32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function]
  135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) {
      |                ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘void ggml_set_op_params(ggml_tensor*, const void*, size_t)’ defined but not used [-Wunused-function]
  129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘bool ggml_are_same_layout(const ggml_tensor*, const ggml_tensor*)’ defined but not used [-Wunused-function]
   77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) {
      |             ^~~~~~~~~~~~~~~~~~~~
[  4%] Building C object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-alloc.c.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/gcc -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/.
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-alloc.c.o -MF CMakeFiles/ggml-base.dir/ggml-alloc.c.o.d -o CMakeFiles/ggml-base.dir/ggml-alloc.c.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-alloc.c In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-alloc.c:4: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘ggml_hash_insert’ defined but not used [-Wunused-function] 261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘ggml_hash_contains’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘ggml_bitset_size’ defined but not used [-Wunused-function] 187 | static size_t ggml_bitset_size(size_t n) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘ggml_set_op_params_f32’ defined but not 
used [-Wunused-function] 150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘ggml_set_op_params_i32’ defined but not used [-Wunused-function] 145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘ggml_get_op_params_f32’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘ggml_get_op_params_i32’ defined but not used [-Wunused-function] 135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘ggml_set_op_params’ defined but not used [-Wunused-function] 129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) { | ^~~~~~~~~~~~~~~~~~ [ 4%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-backend.cpp.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include 
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-backend.cpp.o -MF CMakeFiles/ggml-base.dir/ggml-backend.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml-backend.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-backend.cpp In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-backend.cpp:14: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘bool ggml_hash_contains(const ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘size_t ggml_bitset_size(size_t)’ defined but not used [-Wunused-function] 187 | static size_t ggml_bitset_size(size_t n) { | ^~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘void ggml_set_op_params_f32(ggml_tensor*, uint32_t, float)’ defined but not used [-Wunused-function] 150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘void ggml_set_op_params_i32(ggml_tensor*, uint32_t, int32_t)’ defined but not used [-Wunused-function] 145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘float ggml_get_op_params_f32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘int32_t ggml_get_op_params_i32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘void ggml_set_op_params(ggml_tensor*, const void*, size_t)’ defined but not used [-Wunused-function] 129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) { | ^~~~~~~~~~~~~~~~~~ [ 6%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-opt.cpp.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD 
-DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-opt.cpp.o -MF CMakeFiles/ggml-base.dir/ggml-opt.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml-opt.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-opt.cpp [ 6%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-threading.cpp.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS 
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-threading.cpp.o -MF CMakeFiles/ggml-base.dir/ggml-threading.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml-threading.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-threading.cpp [ 6%] Building C object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-quants.c.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/gcc -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include 
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-quants.c.o -MF CMakeFiles/ggml-base.dir/ggml-quants.c.o.d -o CMakeFiles/ggml-base.dir/ggml-quants.c.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-opt.cpp:6: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘size_t ggml_hash_find_or_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘size_t ggml_hash_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘bool ggml_hash_contains(const ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘size_t ggml_bitset_size(size_t)’ defined but not used [-Wunused-function] 187 | static size_t ggml_bitset_size(size_t n) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘void ggml_set_op_params_f32(ggml_tensor*, uint32_t, float)’ defined but not used [-Wunused-function] 150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘void ggml_set_op_params_i32(ggml_tensor*, uint32_t, int32_t)’ defined but not used [-Wunused-function] 145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘float ggml_get_op_params_f32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘int32_t ggml_get_op_params_i32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘void ggml_set_op_params(ggml_tensor*, const void*, size_t)’ defined but not used [-Wunused-function] 129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘bool ggml_are_same_layout(const ggml_tensor*, const ggml_tensor*)’ defined but not used [-Wunused-function] 77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c:4067:12: warning: ‘iq1_find_best_neighbour’ defined but not used [-Wunused-function] 4067 | static int iq1_find_best_neighbour(const uint16_t * GGML_RESTRICT neighbours, const uint64_t * GGML_RESTRICT grid, | ^~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c:579:14: warning: ‘make_qkx1_quants’ defined but not used [-Wunused-function] 579 | static float make_qkx1_quants(int n, int nmax, const float * GGML_RESTRICT x, uint8_t * GGML_RESTRICT L, float * GGML_RESTRICT the_min, | ^~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c:5: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘ggml_hash_find_or_insert’ defined but not used [-Wunused-function] 282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘ggml_hash_insert’ defined but not used 
[-Wunused-function] 261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘ggml_hash_contains’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘ggml_bitset_size’ defined but not used [-Wunused-function] 187 | static size_t ggml_bitset_size(size_t n) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘ggml_set_op_params_f32’ defined but not used [-Wunused-function] 150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘ggml_set_op_params_i32’ defined but not used [-Wunused-function] 145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘ggml_get_op_params_f32’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘ggml_get_op_params_i32’ defined but not used [-Wunused-function] 135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘ggml_set_op_params’ defined but not used [-Wunused-function] 129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘ggml_are_same_layout’ defined but not used [-Wunused-function] 77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) { | ^~~~~~~~~~~~~~~~~~~~ [ 8%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/gguf.cpp.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. 
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/gguf.cpp.o -MF CMakeFiles/ggml-base.dir/gguf.cpp.o.d -o CMakeFiles/ggml-base.dir/gguf.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/gguf.cpp In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/gguf.cpp:3: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘size_t ggml_hash_find_or_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘size_t ggml_hash_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘bool ggml_hash_contains(const ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | 
^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘size_t ggml_bitset_size(size_t)’ defined but not used [-Wunused-function] 187 | static size_t ggml_bitset_size(size_t n) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘void ggml_set_op_params_f32(ggml_tensor*, uint32_t, float)’ defined but not used [-Wunused-function] 150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘void ggml_set_op_params_i32(ggml_tensor*, uint32_t, int32_t)’ defined but not used [-Wunused-function] 145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘float ggml_get_op_params_f32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘int32_t ggml_get_op_params_i32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘void ggml_set_op_params(ggml_tensor*, const void*, size_t)’ defined but not used [-Wunused-function] 129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) { | 
^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘bool ggml_are_same_layout(const ggml_tensor*, const ggml_tensor*)’ defined but not used [-Wunused-function] 77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) { | ^~~~~~~~~~~~~~~~~~~~ [ 8%] Linking CXX shared library ../../../../../lib/ollama/libggml-base.so cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/cmake -E cmake_link_script CMakeFiles/ggml-base.dir/link.txt --verbose=1 /usr/bin/g++ -fPIC -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -Wl,--dependency-file=CMakeFiles/ggml-base.dir/link.d -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -shared -Wl,-soname,libggml-base.so -o ../../../../../lib/ollama/libggml-base.so "CMakeFiles/ggml-base.dir/ggml.c.o" "CMakeFiles/ggml-base.dir/ggml.cpp.o" "CMakeFiles/ggml-base.dir/ggml-alloc.c.o" "CMakeFiles/ggml-base.dir/ggml-backend.cpp.o" "CMakeFiles/ggml-base.dir/ggml-opt.cpp.o" "CMakeFiles/ggml-base.dir/ggml-threading.cpp.o" "CMakeFiles/ggml-base.dir/ggml-quants.c.o" "CMakeFiles/ggml-base.dir/gguf.cpp.o" -lm gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13' [ 8%] Built target ggml-base 
/usr/bin/gmake -f ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/build.make ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/depend gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13' cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13 && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/DependInfo.cmake "--color=" gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13' /usr/bin/gmake -f ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/build.make ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/build gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13' [ 8%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/acc.cu.o [ 8%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/add-id.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS 
--options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/acc.cu.o -MF CMakeFiles/ggml-cuda.dir/acc.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/acc.cu -o CMakeFiles/ggml-cuda.dir/acc.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" 
"--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/add-id.cu.o -MF CMakeFiles/ggml-cuda.dir/add-id.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/add-id.cu -o CMakeFiles/ggml-cuda.dir/add-id.cu.o [ 10%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/arange.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/arange.cu.o -MF CMakeFiles/ggml-cuda.dir/arange.cu.o.d -x cu -c 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/arange.cu -o CMakeFiles/ggml-cuda.dir/arange.cu.o [ 10%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argmax.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argmax.cu.o -MF CMakeFiles/ggml-cuda.dir/argmax.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/argmax.cu -o CMakeFiles/ggml-cuda.dir/argmax.cu.o [ 12%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argsort.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc 
-forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argsort.cu.o -MF CMakeFiles/ggml-cuda.dir/argsort.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/argsort.cu -o CMakeFiles/ggml-cuda.dir/argsort.cu.o [ 12%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/binbcast.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" 
"--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/binbcast.cu.o -MF CMakeFiles/ggml-cuda.dir/binbcast.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/binbcast.cu -o CMakeFiles/ggml-cuda.dir/binbcast.cu.o [ 14%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/clamp.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" 
"--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/clamp.cu.o -MF CMakeFiles/ggml-cuda.dir/clamp.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/clamp.cu -o CMakeFiles/ggml-cuda.dir/clamp.cu.o [ 14%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/concat.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/concat.cu.o -MF CMakeFiles/ggml-cuda.dir/concat.cu.o.d -x cu -c 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/concat.cu -o CMakeFiles/ggml-cuda.dir/concat.cu.o [ 14%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o -MF CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/conv-transpose-1d.cu -o CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o [ 17%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o cd 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o -MF CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/conv2d-dw.cu -o CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o [ 17%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file 
CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o -MF CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/conv2d-transpose.cu -o CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o [ 19%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/convert.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" 
"--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/convert.cu.o -MF CMakeFiles/ggml-cuda.dir/convert.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/convert.cu -o CMakeFiles/ggml-cuda.dir/convert.cu.o [ 19%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/count-equal.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT 
ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/count-equal.cu.o -MF CMakeFiles/ggml-cuda.dir/count-equal.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/count-equal.cu -o CMakeFiles/ggml-cuda.dir/count-equal.cu.o [ 21%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cpy.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cpy.cu.o -MF CMakeFiles/ggml-cuda.dir/cpy.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/cpy.cu -o CMakeFiles/ggml-cuda.dir/cpy.cu.o [ 21%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o cd 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o -MF CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/cross-entropy-loss.cu -o CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o [ 23%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/diagmask.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG 
-Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/diagmask.cu.o -MF CMakeFiles/ggml-cuda.dir/diagmask.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/diagmask.cu -o CMakeFiles/ggml-cuda.dir/diagmask.cu.o [ 23%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" 
"--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn-tile-f16.cu -o CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o [ 23%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default 
-Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn-tile-f32.cu -o CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o [ 25%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn-wmma-f16.cu -o CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o [ 25%] Building CUDA object 
ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn.cu -o CMakeFiles/ggml-cuda.dir/fattn.cu.o [ 27%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/getrows.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 
-DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/getrows.cu.o -MF CMakeFiles/ggml-cuda.dir/getrows.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/getrows.cu -o CMakeFiles/ggml-cuda.dir/getrows.cu.o [ 27%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" 
"--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o -MF CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/ggml-cuda.cu -o CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o [ 29%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/gla.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT 
ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/gla.cu.o -MF CMakeFiles/ggml-cuda.dir/gla.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/gla.cu -o CMakeFiles/ggml-cuda.dir/gla.cu.o [ 29%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/im2col.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/im2col.cu.o -MF CMakeFiles/ggml-cuda.dir/im2col.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/im2col.cu -o CMakeFiles/ggml-cuda.dir/im2col.cu.o [ 29%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mean.cu.o cd 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mean.cu.o -MF CMakeFiles/ggml-cuda.dir/mean.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mean.cu -o CMakeFiles/ggml-cuda.dir/mean.cu.o [ 31%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmf.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file 
CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmf.cu.o -MF CMakeFiles/ggml-cuda.dir/mmf.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmf.cu -o CMakeFiles/ggml-cuda.dir/mmf.cu.o [ 31%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmq.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" 
"--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmq.cu.o -MF CMakeFiles/ggml-cuda.dir/mmq.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmq.cu -o CMakeFiles/ggml-cuda.dir/mmq.cu.o [ 34%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvf.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvf.cu.o -MF 
CMakeFiles/ggml-cuda.dir/mmvf.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmvf.cu -o CMakeFiles/ggml-cuda.dir/mmvf.cu.o [ 34%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvq.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvq.cu.o -MF CMakeFiles/ggml-cuda.dir/mmvq.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmvq.cu -o CMakeFiles/ggml-cuda.dir/mmvq.cu.o [ 36%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/norm.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && 
/usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/norm.cu.o -MF CMakeFiles/ggml-cuda.dir/norm.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/norm.cu -o CMakeFiles/ggml-cuda.dir/norm.cu.o [ 36%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" 
"--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o -MF CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/opt-step-adamw.cu -o CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o [ 38%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/out-prod.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" 
"--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/out-prod.cu.o -MF CMakeFiles/ggml-cuda.dir/out-prod.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/out-prod.cu -o CMakeFiles/ggml-cuda.dir/out-prod.cu.o [ 38%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pad.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pad.cu.o -MF CMakeFiles/ggml-cuda.dir/pad.cu.o.d -x cu -c 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/pad.cu -o CMakeFiles/ggml-cuda.dir/pad.cu.o [ 38%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pool2d.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pool2d.cu.o -MF CMakeFiles/ggml-cuda.dir/pool2d.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/pool2d.cu -o CMakeFiles/ggml-cuda.dir/pool2d.cu.o [ 40%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/quantize.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc 
-forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/quantize.cu.o -MF CMakeFiles/ggml-cuda.dir/quantize.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/quantize.cu -o CMakeFiles/ggml-cuda.dir/quantize.cu.o [ 40%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/roll.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" 
"--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/roll.cu.o -MF CMakeFiles/ggml-cuda.dir/roll.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/roll.cu -o CMakeFiles/ggml-cuda.dir/roll.cu.o [ 42%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/rope.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" 
"--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/rope.cu.o -MF CMakeFiles/ggml-cuda.dir/rope.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/rope.cu -o CMakeFiles/ggml-cuda.dir/rope.cu.o [ 42%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/scale.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/scale.cu.o -MF CMakeFiles/ggml-cuda.dir/scale.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/scale.cu 
-o CMakeFiles/ggml-cuda.dir/scale.cu.o [ 44%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/set-rows.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/set-rows.cu.o -MF CMakeFiles/ggml-cuda.dir/set-rows.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/set-rows.cu -o CMakeFiles/ggml-cuda.dir/set-rows.cu.o [ 44%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softcap.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 
-DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softcap.cu.o -MF CMakeFiles/ggml-cuda.dir/softcap.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/softcap.cu -o CMakeFiles/ggml-cuda.dir/softcap.cu.o [ 46%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softmax.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" 
"--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softmax.cu.o -MF CMakeFiles/ggml-cuda.dir/softmax.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/softmax.cu -o CMakeFiles/ggml-cuda.dir/softmax.cu.o [ 46%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda 
-compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o -MF CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/ssm-conv.cu -o CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o [ 46%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o -MF CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/ssm-scan.cu -o CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o [ 48%] Building CUDA object 
ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sum.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sum.cu.o -MF CMakeFiles/ggml-cuda.dir/sum.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/sum.cu -o CMakeFiles/ggml-cuda.dir/sum.cu.o [ 48%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sumrows.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG 
-Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sumrows.cu.o -MF CMakeFiles/ggml-cuda.dir/sumrows.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/sumrows.cu -o CMakeFiles/ggml-cuda.dir/sumrows.cu.o [ 51%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/tsembd.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" 
"--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/tsembd.cu.o -MF CMakeFiles/ggml-cuda.dir/tsembd.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/tsembd.cu -o CMakeFiles/ggml-cuda.dir/tsembd.cu.o [ 51%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/unary.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT 
ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/unary.cu.o -MF CMakeFiles/ggml-cuda.dir/unary.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/unary.cu -o CMakeFiles/ggml-cuda.dir/unary.cu.o [ 53%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/upscale.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/upscale.cu.o -MF CMakeFiles/ggml-cuda.dir/upscale.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/upscale.cu -o CMakeFiles/ggml-cuda.dir/upscale.cu.o [ 53%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/wkv.cu.o cd 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/wkv.cu.o -MF CMakeFiles/ggml-cuda.dir/wkv.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/wkv.cu -o CMakeFiles/ggml-cuda.dir/wkv.cu.o [ 53%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS 
--options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o [ 55%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" 
"--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o [ 55%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" 
"--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o [ 57%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" 
"--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o [ 57%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" 
"--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o [ 59%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT 
ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o [ 59%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o -MF 
CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o [ 61%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o.d -x cu -c 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o [ 61%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu -o 
CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o [ 61%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o [ 63%] Building CUDA object 
ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o [ 63%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o cd 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o [ 65%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc 
-forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o [ 65%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS 
-DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o [ 68%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 
"--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o [ 68%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" 
"--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o [ 68%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" 
"--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o [ 70%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" 
"--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o [ 70%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda 
-compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o [ 72%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o -MF 
CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq1_s.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o [ 72%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_s.cu -o 
CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o [ 74%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_xs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o [ 74%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o cd 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_xxs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o [ 76%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 
-DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq3_s.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o [ 76%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" 
"--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq3_xxs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o [ 76%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" 
"--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq4_nl.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o [ 78%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" 
-Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq4_xs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o [ 78%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o -MF 
CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-mxfp4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o [ 80%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q2_k.cu -o 
CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o [ 80%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q3_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o [ 82%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o cd 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q4_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o [ 82%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 
-DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q4_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o [ 82%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" 
"--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q4_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o [ 85%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" 
"--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q5_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o [ 85%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC 
-use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q5_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o [ 87%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o.d -x cu -c 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q5_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o [ 87%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q6_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o [ 89%] Building CUDA object 
ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q8_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o [ 89%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc 
-forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o [ 91%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS 
-DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o [ 91%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 
"--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o [ 91%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" 
"--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o [ 93%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" 
"--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o [ 93%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" 
"--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o [ 95%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler 
-Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o [ 95%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o -MF 
CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o [ 97%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o.d -x cu -c 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o [ 97%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu -o 
CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o [100%] Linking CUDA shared module ../../../../../../lib/ollama/libggml-cuda.so cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/bin/cmake -E cmake_link_script CMakeFiles/ggml-cuda.dir/link.txt --verbose=1 /usr/bin/g++ -fPIC -Wl,--dependency-file=CMakeFiles/ggml-cuda.dir/link.d -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -shared -o ../../../../../../lib/ollama/libggml-cuda.so @CMakeFiles/ggml-cuda.dir/objects1.rsp @CMakeFiles/ggml-cuda.dir/linkLibs.rsp -L"/usr/local/cuda-13/targets/x86_64-linux/lib/stubs" -L"/usr/local/cuda-13/targets/x86_64-linux/lib" gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13' [100%] Built target ggml-cuda gmake[2]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13' /usr/bin/cmake -E cmake_progress_start /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/CMakeFiles 0 gmake[1]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13' + CFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer ' + export CFLAGS + CXXFLAGS='-O2 -flto=auto 
-ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer ' + export CXXFLAGS + FFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' + export FFLAGS + FCFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' + export FCFLAGS + VALAFLAGS=-g + export VALAFLAGS + RUSTFLAGS='-Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Cforce-frame-pointers=yes -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn' + export RUSTFLAGS + LDFLAGS='-Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes ' + export LDFLAGS + 
LT_SYS_LIBRARY_PATH=/usr/lib64: + export LT_SYS_LIBRARY_PATH + CC=gcc + export CC + CXX=g++ + export CXX + /usr/bin/cmake -S . -B redhat-linux-build_cuda-12 -DCMAKE_C_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_CXX_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_Fortran_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON -DCMAKE_INSTALL_DO_STRIP:BOOL=OFF -DCMAKE_INSTALL_PREFIX:PATH=/usr -DCMAKE_INSTALL_FULL_SBINDIR:PATH=/usr/bin -DCMAKE_INSTALL_SBINDIR:PATH=bin -DINCLUDE_INSTALL_DIR:PATH=/usr/include -DLIB_INSTALL_DIR:PATH=/usr/lib64 -DSYSCONF_INSTALL_DIR:PATH=/etc -DSHARE_INSTALL_PREFIX:PATH=/usr/share -DLIB_SUFFIX=64 -DBUILD_SHARED_LIBS:BOOL=ON --preset 'CUDA 12' -DOLLAMA_RUNNER_DIR=cuda_v12 -DCMAKE_CUDA_COMPILER=/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -DCMAKE_CUDA_HOST_COMPILER=g++-14 -DCMAKE_CUDA_FLAGS_RELEASE=-DNDEBUG '-DCMAKE_CUDA_FLAGS=-O2 -g -Xcompiler "-fPIC"' Preset CMake variables: CMAKE_BUILD_TYPE="Release" CMAKE_CUDA_ARCHITECTURES="50;60;61;70;75;80;86;87;89;90;90a;120" CMAKE_MSVC_RUNTIME_LIBRARY="MultiThreaded" -- The C compiler identification is GNU 15.2.1 -- The CXX compiler identification is GNU 15.2.1 -- Detecting C compiler ABI info -- Detecting C compiler ABI info - done -- Check for working C compiler: /usr/bin/gcc - skipped -- Detecting C compile features -- Detecting C compile features - done -- Detecting CXX compiler ABI info -- Detecting CXX compiler ABI info - done -- Check for working CXX compiler: /usr/bin/g++ - skipped -- Detecting CXX compile features -- Detecting CXX compile features - done -- Performing Test CMAKE_HAVE_LIBC_PTHREAD -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success -- Found Threads: TRUE -- Warning: ccache not found - consider installing it for faster compilation or disable this warning with GGML_CCACHE=OFF -- CMAKE_SYSTEM_PROCESSOR: x86_64 -- GGML_SYSTEM_ARCH: x86 -- Including CPU backend -- x86 detected -- Adding CPU backend variant ggml-cpu-x64: -- x86 detected -- Adding CPU backend 
variant ggml-cpu-sse42: -msse4.2 GGML_SSE42 -- x86 detected -- Adding CPU backend variant ggml-cpu-sandybridge: -msse4.2;-mavx GGML_SSE42;GGML_AVX -- x86 detected -- Adding CPU backend variant ggml-cpu-haswell: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2 GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2 -- x86 detected -- Adding CPU backend variant ggml-cpu-skylakex: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2;-mavx512f;-mavx512cd;-mavx512vl;-mavx512dq;-mavx512bw GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2;GGML_AVX512 -- x86 detected -- Adding CPU backend variant ggml-cpu-icelake: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2;-mavx512f;-mavx512cd;-mavx512vl;-mavx512dq;-mavx512bw;-mavx512vbmi;-mavx512vnni GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2;GGML_AVX512;GGML_AVX512_VBMI;GGML_AVX512_VNNI -- x86 detected -- Adding CPU backend variant ggml-cpu-alderlake: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2;-mavxvnni GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2;GGML_AVX_VNNI -- Found CUDAToolkit: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/targets/x86_64-linux/include (found version "12.9.86") -- CUDA Toolkit found -- Using CUDA architectures: 50;60;61;70;75;80;86;87;89;90;90a;120 -- The CUDA compiler identification is NVIDIA 12.9.86 with host compiler GNU 14.2.1 -- Detecting CUDA compiler ABI info -- Detecting CUDA compiler ABI info - done -- Check for working CUDA compiler: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc - skipped -- Detecting CUDA compile features -- Detecting CUDA compile features - done -- Looking for a HIP compiler -- Looking for a HIP compiler - NOTFOUND -- Configuring done (8.3s) -- Generating done (0.1s) CMake Warning: Manually-specified variables were not used by the project: CMAKE_Fortran_FLAGS_RELEASE CMAKE_INSTALL_DO_STRIP INCLUDE_INSTALL_DIR LIB_SUFFIX SHARE_INSTALL_PREFIX SYSCONF_INSTALL_DIR -- Build files have been written to: 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12 + /usr/bin/cmake --build redhat-linux-build_cuda-12 -j2 --verbose --target ggml-cuda Change Dir: '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12' Run Build Command(s): /usr/bin/cmake -E env VERBOSE=1 /usr/bin/gmake -f Makefile -j2 ggml-cuda /usr/bin/cmake -S/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 -B/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12 --check-build-system CMakeFiles/Makefile.cmake 0 /usr/bin/gmake -f CMakeFiles/Makefile2 ggml-cuda gmake[1]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12' /usr/bin/cmake -S/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 -B/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12 --check-build-system CMakeFiles/Makefile.cmake 0 /usr/bin/cmake -E cmake_progress_start /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/CMakeFiles 47 /usr/bin/gmake -f CMakeFiles/Makefile2 ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/all gmake[2]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12' /usr/bin/gmake -f ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/build.make ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/depend gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12' cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12 && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/DependInfo.cmake "--color=" gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12' /usr/bin/gmake -f ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/build.make ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/build gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12' [ 2%] Building C object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.c.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/gcc -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. 
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.c.o -MF CMakeFiles/ggml-base.dir/ggml.c.o.d -o CMakeFiles/ggml-base.dir/ggml.c.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c [ 2%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.cpp.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. 
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.cpp.o -MF CMakeFiles/ggml-base.dir/ggml.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.cpp /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c:5663:13: warning: ‘ggml_hash_map_free’ defined but not used [-Wunused-function] 5663 | static void ggml_hash_map_free(struct hash_map * map) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c:5656:26: warning: ‘ggml_new_hash_map’ defined but not used [-Wunused-function] 5656 | static struct hash_map * ggml_new_hash_map(size_t size) { | ^~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c:5: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘ggml_hash_find_or_insert’ defined but not used [-Wunused-function] 282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘ggml_hash_contains’ defined but not used 
[-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘ggml_get_op_params_f32’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘ggml_are_same_layout’ defined but not used [-Wunused-function] 77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) { | ^~~~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.cpp:1: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘size_t ggml_hash_find_or_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘size_t ggml_hash_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘bool ggml_hash_contains(const ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘size_t ggml_bitset_size(size_t)’ defined but not used [-Wunused-function] 187 | static size_t ggml_bitset_size(size_t n) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘void ggml_set_op_params_f32(ggml_tensor*, uint32_t, float)’ defined but not used [-Wunused-function] 150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘void ggml_set_op_params_i32(ggml_tensor*, uint32_t, int32_t)’ defined but not used [-Wunused-function] 145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘float ggml_get_op_params_f32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘int32_t ggml_get_op_params_i32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘void ggml_set_op_params(ggml_tensor*, const void*, size_t)’ defined but not used [-Wunused-function] 129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) { | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘bool ggml_are_same_layout(const ggml_tensor*, const ggml_tensor*)’ defined but not used [-Wunused-function] 77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) { | ^~~~~~~~~~~~~~~~~~~~ [ 4%] Building C object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-alloc.c.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/gcc -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. 
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-alloc.c.o -MF CMakeFiles/ggml-base.dir/ggml-alloc.c.o.d -o CMakeFiles/ggml-base.dir/ggml-alloc.c.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-alloc.c In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-alloc.c:4: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘ggml_hash_insert’ defined but not used [-Wunused-function] 261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘ggml_hash_contains’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘ggml_bitset_size’ defined but not used [-Wunused-function] 187 | static size_t ggml_bitset_size(size_t n) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘ggml_set_op_params_f32’ defined but not 
used [-Wunused-function] 150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘ggml_set_op_params_i32’ defined but not used [-Wunused-function] 145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘ggml_get_op_params_f32’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘ggml_get_op_params_i32’ defined but not used [-Wunused-function] 135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘ggml_set_op_params’ defined but not used [-Wunused-function] 129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) { | ^~~~~~~~~~~~~~~~~~ [ 4%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-backend.cpp.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include 
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-backend.cpp.o -MF CMakeFiles/ggml-base.dir/ggml-backend.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml-backend.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-backend.cpp In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-backend.cpp:14: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘bool ggml_hash_contains(const ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘size_t ggml_bitset_size(size_t)’ defined but not used [-Wunused-function] 187 | static size_t ggml_bitset_size(size_t n) { | ^~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘void ggml_set_op_params_f32(ggml_tensor*, uint32_t, float)’ defined but not used [-Wunused-function] 150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘void ggml_set_op_params_i32(ggml_tensor*, uint32_t, int32_t)’ defined but not used [-Wunused-function] 145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘float ggml_get_op_params_f32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘int32_t ggml_get_op_params_i32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘void ggml_set_op_params(ggml_tensor*, const void*, size_t)’ defined but not used [-Wunused-function] 129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) { | ^~~~~~~~~~~~~~~~~~ [ 6%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-opt.cpp.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD 
-DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-opt.cpp.o -MF CMakeFiles/ggml-base.dir/ggml-opt.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml-opt.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-opt.cpp [ 6%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-threading.cpp.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS 
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-threading.cpp.o -MF CMakeFiles/ggml-base.dir/ggml-threading.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml-threading.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-threading.cpp [ 6%] Building C object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-quants.c.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/gcc -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include 
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-quants.c.o -MF CMakeFiles/ggml-base.dir/ggml-quants.c.o.d -o CMakeFiles/ggml-base.dir/ggml-quants.c.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-opt.cpp:6: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘size_t ggml_hash_find_or_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘size_t ggml_hash_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘bool ggml_hash_contains(const ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function]
  256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘size_t ggml_bitset_size(size_t)’ defined but not used [-Wunused-function]
  187 | static size_t ggml_bitset_size(size_t n) {
      |               ^~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘void ggml_set_op_params_f32(ggml_tensor*, uint32_t, float)’ defined but not used [-Wunused-function]
  150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘void ggml_set_op_params_i32(ggml_tensor*, uint32_t, int32_t)’ defined but not used [-Wunused-function]
  145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘float ggml_get_op_params_f32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function]
  140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) {
      |              ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘int32_t ggml_get_op_params_i32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function]
  135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) {
      |                ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘void ggml_set_op_params(ggml_tensor*, const void*, size_t)’ defined but not used [-Wunused-function]
  129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘bool ggml_are_same_layout(const ggml_tensor*, const ggml_tensor*)’ defined but not used [-Wunused-function]
   77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) {
      |             ^~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c:4067:12: warning: ‘iq1_find_best_neighbour’ defined but not used [-Wunused-function]
 4067 | static int iq1_find_best_neighbour(const uint16_t * GGML_RESTRICT neighbours, const uint64_t * GGML_RESTRICT grid,
      |            ^~~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c:579:14: warning: ‘make_qkx1_quants’ defined but not used [-Wunused-function]
  579 | static float make_qkx1_quants(int n, int nmax, const float * GGML_RESTRICT x, uint8_t * GGML_RESTRICT L, float * GGML_RESTRICT the_min,
      |              ^~~~~~~~~~~~~~~~
In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c:5:
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘ggml_hash_find_or_insert’ defined but not used [-Wunused-function]
  282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |               ^~~~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘ggml_hash_insert’ defined but not used [-Wunused-function]
  261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |               ^~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘ggml_hash_contains’ defined but not used [-Wunused-function]
  256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘ggml_bitset_size’ defined but not used [-Wunused-function]
  187 | static size_t ggml_bitset_size(size_t n) {
      |               ^~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘ggml_set_op_params_f32’ defined but not used [-Wunused-function]
  150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘ggml_set_op_params_i32’ defined but not used [-Wunused-function]
  145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘ggml_get_op_params_f32’ defined but not used [-Wunused-function]
  140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) {
      |              ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘ggml_get_op_params_i32’ defined but not used [-Wunused-function]
  135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) {
      |                ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘ggml_set_op_params’ defined but not used [-Wunused-function]
  129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘ggml_are_same_layout’ defined but not used [-Wunused-function]
   77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) {
      |             ^~~~~~~~~~~~~~~~~~~~
[  8%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/gguf.cpp.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/gguf.cpp.o -MF CMakeFiles/ggml-base.dir/gguf.cpp.o.d -o CMakeFiles/ggml-base.dir/gguf.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/gguf.cpp
In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/gguf.cpp:3:
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘size_t ggml_hash_find_or_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function]
  282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |               ^~~~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘size_t ggml_hash_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function]
  261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |               ^~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘bool ggml_hash_contains(const ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function]
  256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘size_t ggml_bitset_size(size_t)’ defined but not used [-Wunused-function]
  187 | static size_t ggml_bitset_size(size_t n) {
      |               ^~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘void ggml_set_op_params_f32(ggml_tensor*, uint32_t, float)’ defined but not used [-Wunused-function]
  150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘void ggml_set_op_params_i32(ggml_tensor*, uint32_t, int32_t)’ defined but not used [-Wunused-function]
  145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘float ggml_get_op_params_f32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function]
  140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) {
      |              ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘int32_t ggml_get_op_params_i32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function]
  135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) {
      |                ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘void ggml_set_op_params(ggml_tensor*, const void*, size_t)’ defined but not used [-Wunused-function]
  129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘bool ggml_are_same_layout(const ggml_tensor*, const ggml_tensor*)’ defined but not used [-Wunused-function]
   77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) {
      |             ^~~~~~~~~~~~~~~~~~~~
[  8%] Linking CXX shared library ../../../../../lib/ollama/libggml-base.so
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/cmake -E cmake_link_script CMakeFiles/ggml-base.dir/link.txt --verbose=1
/usr/bin/g++ -fPIC -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -Wl,--dependency-file=CMakeFiles/ggml-base.dir/link.d -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -shared -Wl,-soname,libggml-base.so -o ../../../../../lib/ollama/libggml-base.so "CMakeFiles/ggml-base.dir/ggml.c.o" "CMakeFiles/ggml-base.dir/ggml.cpp.o" "CMakeFiles/ggml-base.dir/ggml-alloc.c.o" "CMakeFiles/ggml-base.dir/ggml-backend.cpp.o" "CMakeFiles/ggml-base.dir/ggml-opt.cpp.o" "CMakeFiles/ggml-base.dir/ggml-threading.cpp.o" "CMakeFiles/ggml-base.dir/ggml-quants.c.o" "CMakeFiles/ggml-base.dir/gguf.cpp.o" -lm
gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12'
[  8%] Built target ggml-base
/usr/bin/gmake -f ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/build.make ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/depend
gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12'
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12 && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/DependInfo.cmake "--color="
gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12'
/usr/bin/gmake -f ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/build.make ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/build
gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12'
[  8%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/acc.cu.o
[  8%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/add-id.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/acc.cu.o -MF CMakeFiles/ggml-cuda.dir/acc.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/acc.cu -o CMakeFiles/ggml-cuda.dir/acc.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/add-id.cu.o -MF CMakeFiles/ggml-cuda.dir/add-id.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/add-id.cu -o CMakeFiles/ggml-cuda.dir/add-id.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 10%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/arange.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/arange.cu.o -MF CMakeFiles/ggml-cuda.dir/arange.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/arange.cu -o CMakeFiles/ggml-cuda.dir/arange.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 10%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argmax.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argmax.cu.o -MF CMakeFiles/ggml-cuda.dir/argmax.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/argmax.cu -o CMakeFiles/ggml-cuda.dir/argmax.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 12%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argsort.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argsort.cu.o -MF CMakeFiles/ggml-cuda.dir/argsort.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/argsort.cu -o CMakeFiles/ggml-cuda.dir/argsort.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 12%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/binbcast.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/binbcast.cu.o -MF CMakeFiles/ggml-cuda.dir/binbcast.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/binbcast.cu -o CMakeFiles/ggml-cuda.dir/binbcast.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 14%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/clamp.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/clamp.cu.o -MF CMakeFiles/ggml-cuda.dir/clamp.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/clamp.cu -o CMakeFiles/ggml-cuda.dir/clamp.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 14%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/concat.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/concat.cu.o -MF CMakeFiles/ggml-cuda.dir/concat.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/concat.cu -o CMakeFiles/ggml-cuda.dir/concat.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 14%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o -MF CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/conv-transpose-1d.cu -o CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 17%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o -MF CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/conv2d-dw.cu -o CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 17%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o -MF CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/conv2d-transpose.cu -o CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 19%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/convert.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/convert.cu.o -MF CMakeFiles/ggml-cuda.dir/convert.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/convert.cu -o CMakeFiles/ggml-cuda.dir/convert.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 19%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/count-equal.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/count-equal.cu.o -MF CMakeFiles/ggml-cuda.dir/count-equal.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/count-equal.cu -o CMakeFiles/ggml-cuda.dir/count-equal.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 21%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cpy.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cpy.cu.o -MF CMakeFiles/ggml-cuda.dir/cpy.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/cpy.cu -o CMakeFiles/ggml-cuda.dir/cpy.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 21%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o -MF CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/cross-entropy-loss.cu -o CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 23%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/diagmask.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/diagmask.cu.o -MF CMakeFiles/ggml-cuda.dir/diagmask.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/diagmask.cu -o CMakeFiles/ggml-cuda.dir/diagmask.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 23%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn-tile-f16.cu -o CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 23%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn-tile-f32.cu -o CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 25%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn-wmma-f16.cu -o CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 25%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn.cu -o CMakeFiles/ggml-cuda.dir/fattn.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 27%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/getrows.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/getrows.cu.o -MF CMakeFiles/ggml-cuda.dir/getrows.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/getrows.cu -o CMakeFiles/ggml-cuda.dir/getrows.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 27%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o -MF CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/ggml-cuda.cu -o CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 29%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/gla.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/gla.cu.o -MF CMakeFiles/ggml-cuda.dir/gla.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/gla.cu -o CMakeFiles/ggml-cuda.dir/gla.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 29%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/im2col.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/im2col.cu.o -MF CMakeFiles/ggml-cuda.dir/im2col.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/im2col.cu -o CMakeFiles/ggml-cuda.dir/im2col.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 29%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mean.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mean.cu.o -MF CMakeFiles/ggml-cuda.dir/mean.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mean.cu -o CMakeFiles/ggml-cuda.dir/mean.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 31%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmf.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmf.cu.o -MF CMakeFiles/ggml-cuda.dir/mmf.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmf.cu -o CMakeFiles/ggml-cuda.dir/mmf.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 31%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmq.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmq.cu.o -MF CMakeFiles/ggml-cuda.dir/mmq.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmq.cu -o CMakeFiles/ggml-cuda.dir/mmq.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 34%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvf.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvf.cu.o -MF CMakeFiles/ggml-cuda.dir/mmvf.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmvf.cu -o CMakeFiles/ggml-cuda.dir/mmvf.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 34%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvq.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvq.cu.o -MF CMakeFiles/ggml-cuda.dir/mmvq.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmvq.cu -o CMakeFiles/ggml-cuda.dir/mmvq.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 36%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/norm.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/norm.cu.o -MF CMakeFiles/ggml-cuda.dir/norm.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/norm.cu -o CMakeFiles/ggml-cuda.dir/norm.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 36%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o -MF CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/opt-step-adamw.cu -o CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 38%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/out-prod.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/out-prod.cu.o -MF CMakeFiles/ggml-cuda.dir/out-prod.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/out-prod.cu -o CMakeFiles/ggml-cuda.dir/out-prod.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 38%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pad.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pad.cu.o -MF CMakeFiles/ggml-cuda.dir/pad.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/pad.cu -o CMakeFiles/ggml-cuda.dir/pad.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 38%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pool2d.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pool2d.cu.o -MF CMakeFiles/ggml-cuda.dir/pool2d.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/pool2d.cu -o CMakeFiles/ggml-cuda.dir/pool2d.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 40%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/quantize.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/quantize.cu.o -MF CMakeFiles/ggml-cuda.dir/quantize.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/quantize.cu -o CMakeFiles/ggml-cuda.dir/quantize.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 40%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/roll.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/roll.cu.o -MF CMakeFiles/ggml-cuda.dir/roll.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/roll.cu -o CMakeFiles/ggml-cuda.dir/roll.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 42%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/rope.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/rope.cu.o -MF CMakeFiles/ggml-cuda.dir/rope.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/rope.cu -o CMakeFiles/ggml-cuda.dir/rope.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 42%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/scale.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/scale.cu.o -MF CMakeFiles/ggml-cuda.dir/scale.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/scale.cu -o CMakeFiles/ggml-cuda.dir/scale.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 44%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/set-rows.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/set-rows.cu.o -MF CMakeFiles/ggml-cuda.dir/set-rows.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/set-rows.cu -o CMakeFiles/ggml-cuda.dir/set-rows.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 44%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softcap.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softcap.cu.o -MF CMakeFiles/ggml-cuda.dir/softcap.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/softcap.cu -o CMakeFiles/ggml-cuda.dir/softcap.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 46%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softmax.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softmax.cu.o -MF CMakeFiles/ggml-cuda.dir/softmax.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/softmax.cu -o CMakeFiles/ggml-cuda.dir/softmax.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 46%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o -MF CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/ssm-conv.cu -o CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 46%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o -MF CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/ssm-scan.cu -o CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 48%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sum.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sum.cu.o -MF CMakeFiles/ggml-cuda.dir/sum.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/sum.cu -o CMakeFiles/ggml-cuda.dir/sum.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 48%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sumrows.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sumrows.cu.o -MF CMakeFiles/ggml-cuda.dir/sumrows.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/sumrows.cu -o CMakeFiles/ggml-cuda.dir/sumrows.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 51%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/tsembd.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/tsembd.cu.o -MF CMakeFiles/ggml-cuda.dir/tsembd.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/tsembd.cu -o CMakeFiles/ggml-cuda.dir/tsembd.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 51%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/unary.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/unary.cu.o -MF CMakeFiles/ggml-cuda.dir/unary.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/unary.cu -o CMakeFiles/ggml-cuda.dir/unary.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 53%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/upscale.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/upscale.cu.o -MF CMakeFiles/ggml-cuda.dir/upscale.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/upscale.cu -o CMakeFiles/ggml-cuda.dir/upscale.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 53%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/wkv.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/wkv.cu.o -MF CMakeFiles/ggml-cuda.dir/wkv.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/wkv.cu -o CMakeFiles/ggml-cuda.dir/wkv.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 53%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 55%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 55%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 57%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 57%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 59%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 59%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 61%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 61%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 61%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
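Every nvcc invocation in this log emits the same deprecation warning for the pre-Turing targets (compute_50 through compute_70), exactly as the warning text suggests fixing. A minimal sketch of how a rebuild could silence it without dropping those architectures, assuming extra CUDA flags are passed through the standard CMake variable `CMAKE_CUDA_FLAGS` (the build-directory name below is taken from this log; the rest is an illustrative assumption, not part of the recorded build):

```shell
# Forward the flag named in the warning to every nvcc invocation.
# -Wno-deprecated-gpu-targets suppresses the "architectures prior to '_75'"
# deprecation notice while still generating code for those targets.
cmake -B redhat-linux-build_cuda-12 \
      -DCMAKE_CUDA_FLAGS="-Wno-deprecated-gpu-targets"
```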
[ 63%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 63%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 65%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 65%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 68%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]"
"--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 68%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu -o 
CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 68%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o.d -x 
cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 70%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT 
ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 70%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" 
"--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 72%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" 
"--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq1_s.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 72%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" 
"--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_s.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 74%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_xs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a 
future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 74%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_xxs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o nvcc warning : Support for 
offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 76%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq3_s.cu -o 
CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 76%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o.d -x cu -c 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq3_xxs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 76%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o -MF 
CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq4_nl.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 78%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT 
ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq4_xs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 78%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math 
-extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-mxfp4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 80%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" 
"--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q2_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 80%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" 
"--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q3_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 82%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" 
"--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q4_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 82%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" 
"--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q4_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 82%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" 
"--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q4_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 85%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" 
"--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q5_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 85%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q5_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future 
release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 87%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q5_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o nvcc warning : Support for offline compilation 
for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 87%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q6_k.cu -o 
CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 89%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o.d -x cu -c 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q8_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 89%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT 
ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 91%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" 
"--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 91%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" 
"--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 91%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" 
"--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
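Every nvcc invocation in this log emits the same deprecation warning, because the generated-code list includes pre-Turing targets (compute_50, compute_60, compute_61, compute_70, all older than sm_75). The warning text itself names the suppression flag. A minimal sketch of threading that flag into a rebuild follows; the variable name and the CMake option shown are illustrative assumptions, since the actual spec file and %cmake options are not visible in this log:

```shell
# The repeated nvcc warning suggests -Wno-deprecated-gpu-targets to silence it.
# Assumption: the project is reconfigured via CMake; exact options in the real
# .spec file are not shown in this log, so treat this as a sketch only.
EXTRA_CUDAFLAGS="-Wno-deprecated-gpu-targets"

# One plausible way to pass it through (hypothetical invocation):
#   cmake -B redhat-linux-build_cuda-12 -DCMAKE_CUDA_FLAGS="${EXTRA_CUDAFLAGS}"
echo "${EXTRA_CUDAFLAGS}"
```

Alternatively, dropping the pre-sm_75 entries from the architecture list removes the warning at its source, at the cost of no longer producing cubins for Maxwell/Pascal/Volta GPUs.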
[ 93%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o nvcc warning : 
Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 93%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 95%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 95%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 97%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 97%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[100%] Linking CUDA shared module ../../../../../../lib/ollama/libggml-cuda.so
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /usr/bin/cmake -E cmake_link_script CMakeFiles/ggml-cuda.dir/link.txt --verbose=1
/usr/bin/g++-14 -fPIC -Wl,--dependency-file=CMakeFiles/ggml-cuda.dir/link.d -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -shared -o ../../../../../../lib/ollama/libggml-cuda.so @CMakeFiles/ggml-cuda.dir/objects1.rsp @CMakeFiles/ggml-cuda.dir/linkLibs.rsp -L"/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/targets/x86_64-linux/lib/stubs" -L"/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/targets/x86_64-linux/lib"
gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12'
[100%] Built target ggml-cuda
gmake[2]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12'
/usr/bin/cmake -E cmake_progress_start /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/CMakeFiles 0
gmake[1]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12'
+ RPM_EC=0
++ jobs -p
+ exit 0
Executing(%install): /bin/sh -e /var/tmp/rpm-tmp.3nX7KD
+ umask 022
+ cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ '[' /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT '!=' / ']'
+ rm -rf /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT
++ dirname /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT
+ mkdir -p /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ mkdir /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT
+ CFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer '
+ export CFLAGS
+ CXXFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer '
+ export CXXFLAGS
+ FFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules '
+ export FFLAGS
+ FCFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules '
+ export FCFLAGS
+ VALAFLAGS=-g
+ export VALAFLAGS
+ RUSTFLAGS='-Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Cforce-frame-pointers=yes -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn'
+ export RUSTFLAGS
+ LDFLAGS='-Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes '
+ export LDFLAGS
+ LT_SYS_LIBRARY_PATH=/usr/lib64:
+ export LT_SYS_LIBRARY_PATH
+ CC=gcc
+ export CC
+ CXX=g++
+ export CXX
+ cd ollama-0.12.3
+ DESTDIR=/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT
+ /usr/bin/cmake --install redhat-linux-build_cuda-13 --component CUDA
-- Install configuration: "Release"
-- Installing: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/lib64/ollama/cuda_v13/libggml-cuda.so
-- Set non-toolchain portion of runtime path of "/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/lib64/ollama/cuda_v13/libggml-cuda.so" to ""
+ DESTDIR=/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT
+ /usr/bin/cmake --install redhat-linux-build_cuda-12 --component CUDA
-- Install configuration: "Release"
-- Installing: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/lib64/ollama/cuda_v12/libggml-cuda.so
-- Set non-toolchain portion of runtime path of "/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/lib64/ollama/cuda_v12/libggml-cuda.so" to ""
+ /usr/bin/find-debuginfo -j2 --strict-build-id -m -i --build-id-seed 0.12.3-1.fc42 --unique-debug-suffix -0.12.3-1.fc42.x86_64 --unique-debug-src-base ollama-ggml-cuda-0.12.3-1.fc42.x86_64 --run-dwz --dwz-low-mem-die-limit 10000000 --dwz-max-die-limit 110000000 -S debugsourcefiles.list /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3
find-debuginfo: starting
Extracting debug info from 2 files
DWARF-compressing 2 files
sepdebugcrcfix: Updated 2 CRC32s, 0 CRC32s did match.
Creating .debug symlinks for symlinks to ELF files
Copying sources found by 'debugedit -l' to /usr/src/debug/ollama-ggml-cuda-0.12.3-1.fc42.x86_64
find-debuginfo: done
+ /usr/lib/rpm/check-buildroot
+ /usr/lib/rpm/redhat/brp-ldconfig
+ /usr/lib/rpm/brp-compress
+ /usr/lib/rpm/redhat/brp-strip-lto /usr/bin/strip
+ /usr/lib/rpm/brp-strip-static-archive /usr/bin/strip
+ /usr/lib/rpm/check-rpaths
+ /usr/lib/rpm/redhat/brp-mangle-shebangs
+ /usr/lib/rpm/brp-remove-la-files
+ env /usr/lib/rpm/redhat/brp-python-bytecompile '' 1 0 -j2
+ /usr/lib/rpm/redhat/brp-python-hardlink
+ /usr/bin/add-determinism --brp -j2 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT
Scanned 39 directories and 162 files, processed 0 inodes, 0 modified (0 replaced + 0 rewritten), 0 unsupported format, 0 errors
Reading /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/SPECPARTS/rpm-debuginfo.specpart
Processing files: ollama-ggml-cuda-13-0.12.3-1.fc42.x86_64
Executing(%license): /bin/sh -e /var/tmp/rpm-tmp.dFmKNX
+ umask 022
+ cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ cd ollama-0.12.3
+ LICENSEDIR=/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/share/licenses/ollama-ggml-cuda-13
+ export LC_ALL=C.UTF-8
+ LC_ALL=C.UTF-8
+ export LICENSEDIR
+ /usr/bin/mkdir -p /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/share/licenses/ollama-ggml-cuda-13
+ cp -pr /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/LICENSE /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/share/licenses/ollama-ggml-cuda-13
+ RPM_EC=0
++ jobs -p
+ exit 0
Provides: libggml-cuda.so()(64bit) ollama-ggml-cuda-13 = 0.12.3-1.fc42 ollama-ggml-cuda-13(x86-64) = 0.12.3-1.fc42
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Requires: libc.so.6()(64bit) libc.so.6(GLIBC_2.14)(64bit) libc.so.6(GLIBC_2.2.5)(64bit) libc.so.6(GLIBC_ABI_DT_RELR)(64bit) libcublas.so.13()(64bit) libcublas.so.13(libcublas.so.13)(64bit) libcuda.so.1()(64bit) libcudart.so.13()(64bit) libcudart.so.13(libcudart.so.13)(64bit) libgcc_s.so.1()(64bit) libgcc_s.so.1(GCC_3.0)(64bit) libm.so.6()(64bit) libm.so.6(GLIBC_2.27)(64bit) libstdc++.so.6()(64bit) libstdc++.so.6(CXXABI_1.3)(64bit) libstdc++.so.6(CXXABI_1.3.9)(64bit) libstdc++.so.6(GLIBCXX_3.4)(64bit) libstdc++.so.6(GLIBCXX_3.4.11)(64bit) libstdc++.so.6(GLIBCXX_3.4.21)(64bit) libstdc++.so.6(GLIBCXX_3.4.30)(64bit) libstdc++.so.6(GLIBCXX_3.4.32)(64bit) rtld(GNU_HASH)
Supplements: if libcublas-13-0 ollama-ggml
Processing files: ollama-ggml-cuda-12-0.12.3-1.fc42.x86_64
Executing(%license): /bin/sh -e /var/tmp/rpm-tmp.vKWo2M
+ umask 022
+ cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ cd ollama-0.12.3
+ LICENSEDIR=/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/share/licenses/ollama-ggml-cuda-12
+ export LC_ALL=C.UTF-8
+ LC_ALL=C.UTF-8
+ export LICENSEDIR
+ /usr/bin/mkdir -p /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/share/licenses/ollama-ggml-cuda-12
+ cp -pr /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/LICENSE /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/share/licenses/ollama-ggml-cuda-12
+ RPM_EC=0
++ jobs -p
+ exit 0
Provides: libggml-cuda.so()(64bit) ollama-ggml-cuda-12 = 0.12.3-1.fc42 ollama-ggml-cuda-12(x86-64) = 0.12.3-1.fc42
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Requires: libc.so.6()(64bit) libc.so.6(GLIBC_2.14)(64bit) libc.so.6(GLIBC_2.2.5)(64bit) libc.so.6(GLIBC_ABI_DT_RELR)(64bit) libcublas.so.12()(64bit) libcublas.so.12(libcublas.so.12)(64bit) libcuda.so.1()(64bit) libcudart.so.12()(64bit) libcudart.so.12(libcudart.so.12)(64bit) libgcc_s.so.1()(64bit) libgcc_s.so.1(GCC_3.0)(64bit) libm.so.6()(64bit) libm.so.6(GLIBC_2.27)(64bit) libstdc++.so.6()(64bit) libstdc++.so.6(CXXABI_1.3)(64bit) libstdc++.so.6(CXXABI_1.3.9)(64bit) libstdc++.so.6(GLIBCXX_3.4)(64bit) libstdc++.so.6(GLIBCXX_3.4.11)(64bit) libstdc++.so.6(GLIBCXX_3.4.21)(64bit) libstdc++.so.6(GLIBCXX_3.4.30)(64bit) libstdc++.so.6(GLIBCXX_3.4.32)(64bit) rtld(GNU_HASH)
Supplements: if libcublas-12-9 ollama-ggml
Processing files: ollama-ggml-cuda-debugsource-0.12.3-1.fc42.x86_64
Provides: ollama-ggml-cuda-debugsource = 0.12.3-1.fc42 ollama-ggml-cuda-debugsource(x86-64) = 0.12.3-1.fc42
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Processing files: ollama-ggml-cuda-debuginfo-0.12.3-1.fc42.x86_64
Provides: ollama-ggml-cuda-debuginfo = 0.12.3-1.fc42 ollama-ggml-cuda-debuginfo(x86-64) = 0.12.3-1.fc42
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Recommends: ollama-ggml-cuda-debugsource(x86-64) = 0.12.3-1.fc42
Processing files: ollama-ggml-cuda-13-debuginfo-0.12.3-1.fc42.x86_64
Provides: debuginfo(build-id) = bd6ea8e8019ee241b27fbb38c682387faed925ac libggml-cuda.so-0.12.3-1.fc42.x86_64.debug()(64bit) ollama-ggml-cuda-13-debuginfo = 0.12.3-1.fc42 ollama-ggml-cuda-13-debuginfo(x86-64) = 0.12.3-1.fc42
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Recommends: ollama-ggml-cuda-debugsource(x86-64) = 0.12.3-1.fc42
Processing files: ollama-ggml-cuda-12-debuginfo-0.12.3-1.fc42.x86_64
Provides: debuginfo(build-id) = f923ca8326c9ddd427231ebcac67552682233b1d libggml-cuda.so-0.12.3-1.fc42.x86_64.debug()(64bit) ollama-ggml-cuda-12-debuginfo = 0.12.3-1.fc42 ollama-ggml-cuda-12-debuginfo(x86-64) = 0.12.3-1.fc42
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Recommends: ollama-ggml-cuda-debugsource(x86-64) = 0.12.3-1.fc42
Checking for unpackaged file(s): /usr/lib/rpm/check-files /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT
Wrote: /builddir/build/RPMS/ollama-ggml-cuda-13-0.12.3-1.fc42.x86_64.rpm
Wrote: /builddir/build/RPMS/ollama-ggml-cuda-13-debuginfo-0.12.3-1.fc42.x86_64.rpm
Wrote: /builddir/build/RPMS/ollama-ggml-cuda-12-debuginfo-0.12.3-1.fc42.x86_64.rpm
Wrote: /builddir/build/RPMS/ollama-ggml-cuda-debugsource-0.12.3-1.fc42.x86_64.rpm
Wrote: /builddir/build/RPMS/ollama-ggml-cuda-debuginfo-0.12.3-1.fc42.x86_64.rpm
Wrote: /builddir/build/RPMS/ollama-ggml-cuda-12-0.12.3-1.fc42.x86_64.rpm
Executing(rmbuild): /bin/sh -e /var/tmp/rpm-tmp.5p4cC3
+ umask 022
+ cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ test -d /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ /usr/bin/chmod -Rf a+rX,u+w,g-w,o-w /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ rm -rf /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ RPM_EC=0
++ jobs -p
+ exit 0
Finish: rpmbuild ollama-ggml-cuda-0.12.3-1.fc42.src.rpm
Finish: build phase for ollama-ggml-cuda-0.12.3-1.fc42.src.rpm
INFO: chroot_scan: 1 files copied to /var/lib/copr-rpmbuild/results/chroot_scan
INFO: /var/lib/mock/fedora-42-x86_64-1759428480.475249/root/var/log/dnf5.log
INFO: chroot_scan: creating tarball /var/lib/copr-rpmbuild/results/chroot_scan.tar.gz
/bin/tar: Removing leading `/' from member names
INFO: Done(/var/lib/copr-rpmbuild/results/ollama-ggml-cuda-0.12.3-1.fc42.src.rpm) Config(child) 179 minutes 33 seconds
INFO: Results and/or logs in: /var/lib/copr-rpmbuild/results
INFO: Cleaning up build root ('cleanup_on_success=True')
Start: clean chroot
INFO: unmounting tmpfs.
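The Provides/Requires blocks above are generated automatically by rpmbuild from the ELF sonames each `libggml-cuda.so` links against, and that is what keeps the two subpackages apart: `ollama-ggml-cuda-13` depends on the `.so.13` CUDA toolkit libraries, `ollama-ggml-cuda-12` on the `.so.12` ones, while the driver interface `libcuda.so.1` is shared by both. A minimal sketch of that split, with soname lists trimmed from the log (the `req*.txt` file names are only illustrative):

```shell
# Trimmed dependency lists from the two "Requires:" blocks above,
# sorted because comm(1) expects sorted input.
printf '%s\n' libcublas.so.13 libcudart.so.13 libcuda.so.1 | sort > req13.txt
printf '%s\n' libcublas.so.12 libcudart.so.12 libcuda.so.1 | sort > req12.txt

# comm -12 prints only the dependencies common to both subpackages:
# the CUDA driver soname, which is versioned independently of the toolkit.
comm -12 req13.txt req12.txt    # -> libcuda.so.1
```

Because all toolkit dependencies are versioned sonames, both subpackages can be installed side by side (`cuda_v12/` and `cuda_v13/` under `/usr/lib64/ollama/`) and the loader resolves whichever one the runtime picks.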
Finish: clean chroot
Finish: run
Running RPMResults tool
Package info: {
  "packages": [
    { "name": "ollama-ggml-cuda-12", "epoch": null, "version": "0.12.3", "release": "1.fc42", "arch": "x86_64" },
    { "name": "ollama-ggml-cuda-debuginfo", "epoch": null, "version": "0.12.3", "release": "1.fc42", "arch": "x86_64" },
    { "name": "ollama-ggml-cuda", "epoch": null, "version": "0.12.3", "release": "1.fc42", "arch": "src" },
    { "name": "ollama-ggml-cuda-debugsource", "epoch": null, "version": "0.12.3", "release": "1.fc42", "arch": "x86_64" },
    { "name": "ollama-ggml-cuda-13-debuginfo", "epoch": null, "version": "0.12.3", "release": "1.fc42", "arch": "x86_64" },
    { "name": "ollama-ggml-cuda-12-debuginfo", "epoch": null, "version": "0.12.3", "release": "1.fc42", "arch": "x86_64" },
    { "name": "ollama-ggml-cuda-13", "epoch": null, "version": "0.12.3", "release": "1.fc42", "arch": "x86_64" }
  ]
}
RPMResults finished
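The closing "Package info" blob is plain JSON, so the list of produced artifacts can be post-processed without any RPM tooling. A rough sketch under the assumption that the blob has been saved to a local file (`results.json` is hypothetical, and the copy below is shortened):

```shell
# Shortened, hypothetical copy of the RPMResults "Package info" JSON.
cat > results.json <<'EOF'
{ "packages": [
    { "name": "ollama-ggml-cuda-12", "version": "0.12.3", "arch": "x86_64" },
    { "name": "ollama-ggml-cuda",    "version": "0.12.3", "arch": "src" },
    { "name": "ollama-ggml-cuda-13", "version": "0.12.3", "arch": "x86_64" }
] }
EOF

# Pull out every "name" value; a crude regex is enough for this flat,
# known layout (a real pipeline would use jq or a JSON parser).
grep -o '"name": *"[^"]*"' results.json | cut -d'"' -f4
```

For anything beyond a quick glance, `jq -r '.packages[].name'` over the full blob is the more robust equivalent.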