Warning: Permanently added '2620:52:3:1:dead:beef:cafe:c116' (ED25519) to the list of known hosts.

You can reproduce this build on your computer by running:

    sudo dnf install copr-rpmbuild
    /usr/bin/copr-rpmbuild --verbose --drop-resultdir --task-url https://copr.fedorainfracloud.org/backend/get-build-task/9640748-fedora-42-x86_64 --chroot fedora-42-x86_64

Version: 1.6
PID: 2248
Logging PID: 2250
Task:
{'allow_user_ssh': False,
 'appstream': False,
 'background': False,
 'build_id': 9640748,
 'buildroot_pkgs': [],
 'chroot': 'fedora-42-x86_64',
 'enable_net': False,
 'fedora_review': False,
 'git_hash': '38a37bebc7b1ab1ef3d8eb11e3541d2494224ffd',
 'git_repo': 'https://copr-dist-git.fedorainfracloud.org/git/fachep/ollama/ollama-ggml-cuda',
 'isolation': 'default',
 'memory_reqs': 2048,
 'package_name': 'ollama-ggml-cuda',
 'package_version': '0.12.3-1',
 'project_dirname': 'ollama',
 'project_name': 'ollama',
 'project_owner': 'fachep',
 'repo_priority': None,
 'repos': [{'baseurl': 'https://download.copr.fedorainfracloud.org/results/fachep/ollama/fedora-42-x86_64/',
            'id': 'copr_base',
            'name': 'Copr repository',
            'priority': None},
           {'baseurl': 'https://developer.download.nvidia.cn/compute/cuda/repos/fedora42/x86_64/',
            'id': 'https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64',
            'name': 'Additional repo https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64'},
           {'baseurl': 'https://developer.download.nvidia.cn/compute/cuda/repos/fedora41/x86_64/',
            'id': 'https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64',
            'name': 'Additional repo https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64'}],
 'sandbox': 'fachep/ollama--fachep',
 'source_json': {},
 'source_type': None,
 'ssh_public_keys': None,
 'storage': 0,
 'submitter': 'fachep',
 'tags': [],
 'task_id': '9640748-fedora-42-x86_64',
 'timeout': 18000,
 'uses_devel_repo': False,
 'with_opts': [],
 'without_opts': []}

Running: git clone https://copr-dist-git.fedorainfracloud.org/git/fachep/ollama/ollama-ggml-cuda /var/lib/copr-rpmbuild/workspace/workdir-cnrrwoue/ollama-ggml-cuda --depth 500 --no-single-branch --recursive
cmd: ['git', 'clone', 'https://copr-dist-git.fedorainfracloud.org/git/fachep/ollama/ollama-ggml-cuda', '/var/lib/copr-rpmbuild/workspace/workdir-cnrrwoue/ollama-ggml-cuda', '--depth', '500', '--no-single-branch', '--recursive']
cwd: .
rc: 0
stdout:
stderr: Cloning into '/var/lib/copr-rpmbuild/workspace/workdir-cnrrwoue/ollama-ggml-cuda'...

Running: git checkout 38a37bebc7b1ab1ef3d8eb11e3541d2494224ffd --
cmd: ['git', 'checkout', '38a37bebc7b1ab1ef3d8eb11e3541d2494224ffd', '--']
cwd: /var/lib/copr-rpmbuild/workspace/workdir-cnrrwoue/ollama-ggml-cuda
rc: 0
stdout:
stderr: Note: switching to '38a37bebc7b1ab1ef3d8eb11e3541d2494224ffd'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command.
Example: git switch -c <new-branch-name>

Or undo this operation with: git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at 38a37be automatic import of ollama-ggml-cuda

Running: dist-git-client sources
cmd: ['dist-git-client', 'sources']
cwd: /var/lib/copr-rpmbuild/workspace/workdir-cnrrwoue/ollama-ggml-cuda
rc: 0
stdout:
stderr: INFO: Reading stdout from command: git rev-parse --abbrev-ref HEAD
INFO: Reading stdout from command: git rev-parse HEAD
INFO: Reading sources specification file: sources
INFO: Downloading v0.12.3.tar.gz
INFO: Reading stdout from command: curl --help all
INFO: Calling: curl -H Pragma: -o v0.12.3.tar.gz --location --connect-timeout 60 --retry 3 --retry-delay 10 --remote-time --show-error --fail --retry-all-errors https://copr-dist-git.fedorainfracloud.org/repo/pkgs/fachep/ollama/ollama-ggml-cuda/v0.12.3.tar.gz/md5/f096acee5e82596e9afd4d07ed477de2/v0.12.3.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 10.5M  100 10.5M    0     0  24.3M      0 --:--:-- --:--:-- --:--:-- 24.3M
INFO: Reading stdout from command: md5sum v0.12.3.tar.gz
tail: /var/lib/copr-rpmbuild/main.log: file truncated

Running (timeout=18000): unbuffer mock --spec /var/lib/copr-rpmbuild/workspace/workdir-cnrrwoue/ollama-ggml-cuda/ollama-ggml-cuda.spec --sources /var/lib/copr-rpmbuild/workspace/workdir-cnrrwoue/ollama-ggml-cuda --resultdir /var/lib/copr-rpmbuild/results --uniqueext 1759428480.475249 -r /var/lib/copr-rpmbuild/results/configs/child.cfg
INFO: mock.py version 6.3 starting (python version = 3.13.7, NVR = mock-6.3-1.fc42), args: /usr/libexec/mock/mock --spec /var/lib/copr-rpmbuild/workspace/workdir-cnrrwoue/ollama-ggml-cuda/ollama-ggml-cuda.spec --sources /var/lib/copr-rpmbuild/workspace/workdir-cnrrwoue/ollama-ggml-cuda --resultdir /var/lib/copr-rpmbuild/results --uniqueext 1759428480.475249 -r /var/lib/copr-rpmbuild/results/configs/child.cfg
Start(bootstrap): init plugins
INFO: tmpfs initialized
INFO: selinux enabled
INFO: chroot_scan: initialized
INFO: compress_logs: initialized
Finish(bootstrap): init plugins
Start: init plugins
INFO: tmpfs initialized
INFO: selinux enabled
INFO: chroot_scan: initialized
INFO: compress_logs: initialized
Finish: init plugins
INFO: Signal handler active
Start: run
INFO: Start(/var/lib/copr-rpmbuild/workspace/workdir-cnrrwoue/ollama-ggml-cuda/ollama-ggml-cuda.spec) Config(fedora-42-x86_64)
Start: clean chroot
Finish: clean chroot
Mock Version: 6.3
INFO: Mock Version: 6.3
Start(bootstrap): chroot init
INFO: mounting tmpfs at /var/lib/mock/fedora-42-x86_64-bootstrap-1759428480.475249/root.
INFO: calling preinit hooks
INFO: enabled root cache
INFO: enabled package manager cache
Start(bootstrap): cleaning package manager metadata
Finish(bootstrap): cleaning package manager metadata
INFO: Guessed host environment type: unknown
INFO: Using container image: registry.fedoraproject.org/fedora:42
INFO: Pulling image: registry.fedoraproject.org/fedora:42
INFO: Tagging container image as mock-bootstrap-39efaca6-2eaa-4e95-b733-a8ff9aab69a2
INFO: Checking that a957e8565081588d951e7d03e0623a69ff0e5191d672b63a6a58f06e615c432c image matches host's architecture
INFO: Copy content of container a957e8565081588d951e7d03e0623a69ff0e5191d672b63a6a58f06e615c432c to /var/lib/mock/fedora-42-x86_64-bootstrap-1759428480.475249/root
INFO: mounting a957e8565081588d951e7d03e0623a69ff0e5191d672b63a6a58f06e615c432c with podman image mount
INFO: image a957e8565081588d951e7d03e0623a69ff0e5191d672b63a6a58f06e615c432c as /var/lib/containers/storage/overlay/83d7cea453f45be652d7781a7bdec8b1d322b41307e4455bbbfec597db48be36/merged
INFO: umounting image a957e8565081588d951e7d03e0623a69ff0e5191d672b63a6a58f06e615c432c (/var/lib/containers/storage/overlay/83d7cea453f45be652d7781a7bdec8b1d322b41307e4455bbbfec597db48be36/merged) with podman image umount
INFO: Removing image mock-bootstrap-39efaca6-2eaa-4e95-b733-a8ff9aab69a2
INFO: Package manager dnf5 detected and used (fallback)
INFO: Not updating bootstrap chroot, bootstrap_image_ready=True
Start(bootstrap): creating root cache
Finish(bootstrap): creating root cache
Finish(bootstrap): chroot init
Start: chroot init
INFO: mounting tmpfs at /var/lib/mock/fedora-42-x86_64-1759428480.475249/root.
INFO: calling preinit hooks
INFO: enabled root cache
INFO: enabled package manager cache
Start: cleaning package manager metadata
Finish: cleaning package manager metadata
INFO: enabled HW Info plugin
INFO: Package manager dnf5 detected and used (direct choice)
INFO: Buildroot is handled by package management downloaded with a bootstrap image:
  rpm-4.20.1-1.fc42.x86_64
  rpm-sequoia-1.7.0-5.fc42.x86_64
  dnf5-5.2.16.0-1.fc42.x86_64
  dnf5-plugins-5.2.16.0-1.fc42.x86_64
Start: installing minimal buildroot with dnf5
Updating and loading repositories:
 Copr repository                        100% |   3.1 KiB/s |   1.6 KiB | 00m01s
 Additional repo https_developer_downlo 100% |  67.7 KiB/s |  47.8 KiB | 00m01s
 Additional repo https_developer_downlo 100% | 141.7 KiB/s | 109.0 KiB | 00m01s
 fedora                                 100% |  11.4 MiB/s |  35.4 MiB | 00m03s
 updates                                100% | 799.9 KiB/s |  10.3 MiB | 00m13s
Repositories loaded.
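For reference, the dist-git-client step earlier in this log validates the downloaded source against the MD5 embedded in its URL path (f096acee5e82596e9afd4d07ed477de2) by running md5sum on the fetched tarball. A minimal sketch of that check using Python's hashlib; the payload here is a stand-in, since the real v0.12.3.tar.gz is not part of this log:

```python
# Sketch of the checksum verification dist-git-client performs: compare the
# expected MD5 (taken from the download URL above) against the digest of the
# fetched file. The payload below is a stand-in, not the real tarball.
import hashlib


def md5_of(data: bytes) -> str:
    # Hash the contents in one call; for a large tarball you would stream
    # chunks into hashlib.md5().update() instead.
    return hashlib.md5(data).hexdigest()


expected = "f096acee5e82596e9afd4d07ed477de2"
payload = b"stand-in for v0.12.3.tar.gz contents"
print(md5_of(payload) == expected)  # False: the stand-in is not the real tarball
```

In the build, a mismatch here would abort before mock is ever invoked, since the spec's sources would not be trustworthy.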
 Package                            Arch    Version                      Repository  Size
Installing group/module packages:
 bash                               x86_64  5.2.37-1.fc42                fedora      8.2 MiB
 bzip2                              x86_64  1.0.8-20.fc42                fedora      99.3 KiB
 coreutils                          x86_64  9.6-6.fc42                   updates     5.4 MiB
 cpio                               x86_64  2.15-4.fc42                  fedora      1.1 MiB
 diffutils                          x86_64  3.12-1.fc42                  updates     1.6 MiB
 fedora-release-common              noarch  42-30                        updates     20.2 KiB
 findutils                          x86_64  1:4.10.0-5.fc42              fedora      1.9 MiB
 gawk                               x86_64  5.3.1-1.fc42                 fedora      1.7 MiB
 glibc-minimal-langpack             x86_64  2.41-11.fc42                 updates     0.0 B
 grep                               x86_64  3.11-10.fc42                 fedora      1.0 MiB
 gzip                               x86_64  1.13-3.fc42                  fedora      392.9 KiB
 info                               x86_64  7.2-3.fc42                   fedora      357.9 KiB
 patch                              x86_64  2.8-1.fc42                   updates     222.8 KiB
 redhat-rpm-config                  noarch  342-4.fc42                   updates     185.5 KiB
 rpm-build                          x86_64  4.20.1-1.fc42                fedora      168.7 KiB
 sed                                x86_64  4.9-4.fc42                   fedora      857.3 KiB
 shadow-utils                       x86_64  2:4.17.4-1.fc42              fedora      4.0 MiB
 tar                                x86_64  2:1.35-5.fc42                fedora      3.0 MiB
 unzip                              x86_64  6.0-66.fc42                  fedora      390.3 KiB
 util-linux                         x86_64  2.40.4-7.fc42                fedora      3.4 MiB
 which                              x86_64  2.23-2.fc42                  updates     83.5 KiB
 xz                                 x86_64  1:5.8.1-2.fc42               updates     1.3 MiB
Installing dependencies:
 add-determinism                    x86_64  0.6.0-1.fc42                 fedora      2.5 MiB
 alternatives                       x86_64  1.33-1.fc42                  updates     62.2 KiB
 ansible-srpm-macros                noarch  1-17.1.fc42                  fedora      35.7 KiB
 audit-libs                         x86_64  4.1.1-1.fc42                 updates     378.8 KiB
 basesystem                         noarch  11-22.fc42                   fedora      0.0 B
 binutils                           x86_64  2.44-6.fc42                  updates     25.8 MiB
 build-reproducibility-srpm-macros  noarch  0.6.0-1.fc42                 fedora      735.0 B
 bzip2-libs                         x86_64  1.0.8-20.fc42                fedora      84.6 KiB
 ca-certificates                    noarch  2025.2.80_v9.0.304-1.0.fc42  updates     2.7 MiB
 coreutils-common                   x86_64  9.6-6.fc42                   updates     11.1 MiB
 crypto-policies                    noarch  20250707-1.gitad370a8.fc42   updates     142.9 KiB
 curl                               x86_64  8.11.1-6.fc42                updates     450.6 KiB
 cyrus-sasl-lib                     x86_64  2.1.28-30.fc42               fedora      2.3 MiB
 debugedit                          x86_64  5.1-7.fc42                   updates     192.7 KiB
 dwz                                x86_64  0.16-1.fc42                  updates     287.1 KiB
 ed                                 x86_64  1.21-2.fc42                  fedora      146.5 KiB
 efi-srpm-macros                    noarch  6-3.fc42                     updates     40.1 KiB
 elfutils                           x86_64  0.193-2.fc42                 updates     2.9 MiB
 elfutils-debuginfod-client         x86_64  0.193-2.fc42                 updates     83.9 KiB
 elfutils-default-yama-scope        noarch  0.193-2.fc42                 updates     1.8 KiB
 elfutils-libelf                    x86_64  0.193-2.fc42                 updates     1.2 MiB
 elfutils-libs                      x86_64  0.193-2.fc42                 updates     683.4 KiB
 fedora-gpg-keys                    noarch  42-1                         fedora      128.2 KiB
 fedora-release                     noarch  42-30                        updates     0.0 B
 fedora-release-identity-basic      noarch  42-30                        updates     646.0 B
 fedora-repos                       noarch  42-1                         fedora      4.9 KiB
 file                               x86_64  5.46-3.fc42                  updates     100.2 KiB
 file-libs                          x86_64  5.46-3.fc42                  updates     11.9 MiB
 filesystem                         x86_64  3.18-47.fc42                 updates     112.0 B
 filesystem-srpm-macros             noarch  3.18-47.fc42                 updates     38.2 KiB
 fonts-srpm-macros                  noarch  1:2.0.5-22.fc42              updates     55.8 KiB
 forge-srpm-macros                  noarch  0.4.0-2.fc42                 fedora      38.9 KiB
 fpc-srpm-macros                    noarch  1.3-14.fc42                  fedora      144.0 B
 gdb-minimal                        x86_64  16.3-1.fc42                  updates     13.2 MiB
 gdbm-libs                          x86_64  1:1.23-9.fc42                fedora      129.9 KiB
 ghc-srpm-macros                    noarch  1.9.2-2.fc42                 fedora      779.0 B
 glibc                              x86_64  2.41-11.fc42                 updates     6.6 MiB
 glibc-common                       x86_64  2.41-11.fc42                 updates     1.0 MiB
 glibc-gconv-extra                  x86_64  2.41-11.fc42                 updates     7.2 MiB
 gmp                                x86_64  1:6.3.0-4.fc42               fedora      811.3 KiB
 gnat-srpm-macros                   noarch  6-7.fc42                     fedora      1.0 KiB
 gnulib-l10n                        noarch  20241231-1.fc42              updates     655.0 KiB
 go-srpm-macros                     noarch  3.8.0-1.fc42                 updates     61.9 KiB
 jansson                            x86_64  2.14-2.fc42                  fedora      93.1 KiB
 json-c                             x86_64  0.18-2.fc42                  fedora      86.7 KiB
 kernel-srpm-macros                 noarch  1.0-25.fc42                  fedora      1.9 KiB
 keyutils-libs                      x86_64  1.6.3-5.fc42                 fedora      58.3 KiB
 krb5-libs                          x86_64  1.21.3-6.fc42                updates     2.3 MiB
 libacl                             x86_64  2.3.2-3.fc42                 fedora      38.3 KiB
 libarchive                         x86_64  3.8.1-1.fc42                 updates     955.2 KiB
 libattr                            x86_64  2.5.2-5.fc42                 fedora      27.1 KiB
 libblkid                           x86_64  2.40.4-7.fc42                fedora      262.4 KiB
 libbrotli                          x86_64  1.1.0-6.fc42                 fedora      841.3 KiB
 libcap                             x86_64  2.73-2.fc42                  fedora      207.1 KiB
 libcap-ng                          x86_64  0.8.5-4.fc42                 fedora      72.9 KiB
 libcom_err                         x86_64  1.47.2-3.fc42                fedora      67.1 KiB
 libcurl                            x86_64  8.11.1-6.fc42                updates     834.1 KiB
 libeconf                           x86_64  0.7.6-2.fc42                 updates     64.6 KiB
 libevent                           x86_64  2.1.12-15.fc42               fedora      903.1 KiB
 libfdisk                           x86_64  2.40.4-7.fc42                fedora      372.3 KiB
 libffi                             x86_64  3.4.6-5.fc42                 fedora      82.3 KiB
 libgcc                             x86_64  15.2.1-1.fc42                updates     266.6 KiB
 libgomp                            x86_64  15.2.1-1.fc42                updates     541.1 KiB
 libidn2                            x86_64  2.3.8-1.fc42                 fedora      556.5 KiB
 libmount                           x86_64  2.40.4-7.fc42                fedora      356.3 KiB
 libnghttp2                         x86_64  1.64.0-3.fc42                fedora      170.4 KiB
 libpkgconf                         x86_64  2.3.0-2.fc42                 fedora      78.1 KiB
 libpsl                             x86_64  0.21.5-5.fc42                fedora      76.4 KiB
 libselinux                         x86_64  3.8-3.fc42                   updates     193.1 KiB
 libsemanage                        x86_64  3.8.1-2.fc42                 updates     304.4 KiB
 libsepol                           x86_64  3.8-1.fc42                   fedora      826.0 KiB
 libsmartcols                       x86_64  2.40.4-7.fc42                fedora      180.4 KiB
 libssh                             x86_64  0.11.3-1.fc42                updates     567.1 KiB
 libssh-config                      noarch  0.11.3-1.fc42                updates     277.0 B
 libstdc++                          x86_64  15.2.1-1.fc42                updates     2.8 MiB
 libtasn1                           x86_64  4.20.0-1.fc42                fedora      176.3 KiB
 libtool-ltdl                       x86_64  2.5.4-4.fc42                 fedora      70.1 KiB
 libunistring                       x86_64  1.1-9.fc42                   fedora      1.7 MiB
 libuuid                            x86_64  2.40.4-7.fc42                fedora      37.3 KiB
 libverto                           x86_64  0.3.2-10.fc42                fedora      25.4 KiB
 libxcrypt                          x86_64  4.4.38-7.fc42                updates     284.5 KiB
 libxml2                            x86_64  2.12.10-1.fc42               fedora      1.7 MiB
 libzstd                            x86_64  1.5.7-1.fc42                 fedora      807.8 KiB
 lua-libs                           x86_64  5.4.8-1.fc42                 updates     280.8 KiB
 lua-srpm-macros                    noarch  1-15.fc42                    fedora      1.3 KiB
 lz4-libs                           x86_64  1.10.0-2.fc42                fedora      157.4 KiB
 mpfr                               x86_64  4.2.2-1.fc42                 fedora      828.8 KiB
 ncurses-base                       noarch  6.5-5.20250125.fc42          fedora      326.8 KiB
 ncurses-libs                       x86_64  6.5-5.20250125.fc42          fedora      946.3 KiB
 ocaml-srpm-macros                  noarch  10-4.fc42                    fedora      1.9 KiB
 openblas-srpm-macros               noarch  2-19.fc42                    fedora      112.0 B
 openldap                           x86_64  2.6.10-1.fc42                updates     655.8 KiB
 openssl-libs                       x86_64  1:3.2.4-4.fc42               updates     7.8 MiB
 p11-kit                            x86_64  0.25.8-1.fc42                updates     2.3 MiB
 p11-kit-trust                      x86_64  0.25.8-1.fc42                updates     446.5 KiB
 package-notes-srpm-macros          noarch  0.5-13.fc42                  fedora      1.6 KiB
 pam-libs                           x86_64  1.7.0-6.fc42                 updates     126.7 KiB
 pcre2                              x86_64  10.45-1.fc42                 fedora      697.7 KiB
 pcre2-syntax                       noarch  10.45-1.fc42                 fedora      273.9 KiB
 perl-srpm-macros                   noarch  1-57.fc42                    fedora      861.0 B
 pkgconf                            x86_64  2.3.0-2.fc42                 fedora      88.5 KiB
 pkgconf-m4                         noarch  2.3.0-2.fc42                 fedora      14.4 KiB
 pkgconf-pkg-config                 x86_64  2.3.0-2.fc42                 fedora      989.0 B
 popt                               x86_64  1.19-8.fc42                  fedora      132.8 KiB
 publicsuffix-list-dafsa            noarch  20250616-1.fc42              updates     69.1 KiB
 pyproject-srpm-macros              noarch  1.18.4-1.fc42                updates     1.9 KiB
 python-srpm-macros                 noarch  3.13-5.fc42                  updates     51.0 KiB
 qt5-srpm-macros                    noarch  5.15.17-1.fc42               updates     500.0 B
 qt6-srpm-macros                    noarch  6.9.2-1.fc42                 updates     464.0 B
 readline                           x86_64  8.2-13.fc42                  fedora      485.0 KiB
 rpm                                x86_64  4.20.1-1.fc42                fedora      3.1 MiB
 rpm-build-libs                     x86_64  4.20.1-1.fc42                fedora      206.6 KiB
 rpm-libs                           x86_64  4.20.1-1.fc42                fedora      721.8 KiB
 rpm-sequoia                        x86_64  1.7.0-5.fc42                 fedora      2.4 MiB
 rust-srpm-macros                   noarch  26.4-1.fc42                  updates     4.8 KiB
 setup                              noarch  2.15.0-13.fc42               fedora      720.9 KiB
 sqlite-libs                        x86_64  3.47.2-5.fc42                updates     1.5 MiB
 systemd-libs                       x86_64  257.9-2.fc42                 updates     2.2 MiB
 systemd-standalone-sysusers        x86_64  257.9-2.fc42                 updates     277.3 KiB
 tree-sitter-srpm-macros            noarch  0.1.0-8.fc42                 fedora      6.5 KiB
 util-linux-core                    x86_64  2.40.4-7.fc42                fedora      1.4 MiB
 xxhash-libs                        x86_64  0.8.3-2.fc42                 fedora      90.2 KiB
 xz-libs                            x86_64  1:5.8.1-2.fc42               updates     217.8 KiB
 zig-srpm-macros                    noarch  1-4.fc42                     fedora      1.1 KiB
 zip                                x86_64  3.0-43.fc42                  fedora      698.5 KiB
 zlib-ng-compat                     x86_64  2.2.5-2.fc42                 updates     137.6 KiB
 zstd                               x86_64  1.5.7-1.fc42                 fedora      1.7 MiB
Installing groups:
 Buildsystem building group

Transaction Summary:
 Installing:       149 packages

Total size of inbound packages is 52 MiB. Need to download 52 MiB.
After this operation, 178 MiB extra will be used (install 178 MiB, remove 0 B).
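The summary's installed-size figure is just the Size column summed with unit conversion (1 KiB = 1024 B, 1 MiB = 1024 KiB). A small illustrative sketch of that arithmetic using three rows from the table above; this is not dnf's actual code, only the unit math:

```python
# Sketch: sum per-package sizes like those in the Size column above.
# Rows are (package, size-string) pairs copied from the transaction table.
rows = [
    ("bash", "8.2 MiB"),
    ("bzip2", "99.3 KiB"),
    ("coreutils", "5.4 MiB"),
]

UNITS = {"B": 1, "KiB": 1024, "MiB": 1024 ** 2}


def to_bytes(size: str) -> float:
    # "8.2 MiB" -> 8.2 * 1024**2 bytes
    value, unit = size.split()
    return float(value) * UNITS[unit]


total = sum(to_bytes(size) for _, size in rows)
print(f"{total / 1024 ** 2:.1f} MiB")  # 13.7 MiB for these three rows
```

Summing all 149 rows this way gives the "178 MiB extra" install figure; the separate 52 MiB number is the compressed download size of the RPMs, which the table does not show.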
[ 1/149] bzip2-0:1.0.8-20.fc42.x86_64 100% | 148.8 KiB/s | 52.1 KiB | 00m00s
[ 2/149] cpio-0:2.15-4.fc42.x86_64 100% | 553.8 KiB/s | 294.6 KiB | 00m01s
[ 3/149] grep-0:3.11-10.fc42.x86_64 100% | 1.2 MiB/s | 300.1 KiB | 00m00s
[ 4/149] findutils-1:4.10.0-5.fc42.x86 100% | 1.0 MiB/s | 551.5 KiB | 00m01s
[ 5/149] gzip-0:1.13-3.fc42.x86_64 100% | 718.8 KiB/s | 170.4 KiB | 00m00s
[ 6/149] info-0:7.2-3.fc42.x86_64 100% | 1.0 MiB/s | 183.8 KiB | 00m00s
[ 7/149] bash-0:5.2.37-1.fc42.x86_64 100% | 1.7 MiB/s | 1.8 MiB | 00m01s
[ 8/149] rpm-build-0:4.20.1-1.fc42.x86 100% | 779.2 KiB/s | 81.8 KiB | 00m00s
[ 9/149] sed-0:4.9-4.fc42.x86_64 100% | 1.4 MiB/s | 317.3 KiB | 00m00s
[ 10/149] shadow-utils-2:4.17.4-1.fc42. 100% | 4.4 MiB/s | 1.3 MiB | 00m00s
[ 11/149] unzip-0:6.0-66.fc42.x86_64 100% | 1.1 MiB/s | 184.6 KiB | 00m00s
[ 12/149] tar-2:1.35-5.fc42.x86_64 100% | 2.3 MiB/s | 862.5 KiB | 00m00s
[ 13/149] fedora-release-common-0:42-30 100% | 60.6 KiB/s | 24.5 KiB | 00m00s
[ 14/149] gawk-0:5.3.1-1.fc42.x86_64 100% | 2.7 MiB/s | 1.1 MiB | 00m00s
[ 15/149] diffutils-0:3.12-1.fc42.x86_6 100% | 310.1 KiB/s | 392.6 KiB | 00m01s
[ 16/149] patch-0:2.8-1.fc42.x86_64 100% | 279.5 KiB/s | 113.5 KiB | 00m00s
[ 17/149] glibc-minimal-langpack-0:2.41 100% | 106.3 KiB/s | 98.7 KiB | 00m01s
[ 18/149] redhat-rpm-config-0:342-4.fc4 100% | 340.7 KiB/s | 81.1 KiB | 00m00s
[ 19/149] util-linux-0:2.40.4-7.fc42.x8 100% | 4.9 MiB/s | 1.2 MiB | 00m00s
[ 20/149] which-0:2.23-2.fc42.x86_64 100% | 257.7 KiB/s | 41.7 KiB | 00m00s
[ 21/149] ncurses-libs-0:6.5-5.20250125 100% | 2.8 MiB/s | 335.0 KiB | 00m00s
[ 22/149] bzip2-libs-0:1.0.8-20.fc42.x8 100% | 613.7 KiB/s | 43.6 KiB | 00m00s
[ 23/149] pcre2-0:10.45-1.fc42.x86_64 100% | 2.5 MiB/s | 262.8 KiB | 00m00s
[ 24/149] popt-0:1.19-8.fc42.x86_64 100% | 867.7 KiB/s | 65.9 KiB | 00m00s
[ 25/149] coreutils-0:9.6-6.fc42.x86_64 100% | 457.5 KiB/s | 1.1 MiB | 00m03s
[ 26/149] readline-0:8.2-13.fc42.x86_64 100% | 2.2 MiB/s | 215.2 KiB | 00m00s
[ 27/149] rpm-build-libs-0:4.20.1-1.fc4 100% | 1.2 MiB/s | 99.7 KiB | 00m00s
[ 28/149] rpm-libs-0:4.20.1-1.fc42.x86_ 100% | 2.8 MiB/s | 312.0 KiB | 00m00s
[ 29/149] zstd-0:1.5.7-1.fc42.x86_64 100% | 3.5 MiB/s | 485.9 KiB | 00m00s
[ 30/149] libacl-0:2.3.2-3.fc42.x86_64 100% | 333.4 KiB/s | 23.0 KiB | 00m00s
[ 31/149] setup-0:2.15.0-13.fc42.noarch 100% | 1.7 MiB/s | 155.8 KiB | 00m00s
[ 32/149] rpm-0:4.20.1-1.fc42.x86_64 100% | 928.0 KiB/s | 548.4 KiB | 00m01s
[ 33/149] gmp-1:6.3.0-4.fc42.x86_64 100% | 2.6 MiB/s | 317.7 KiB | 00m00s
[ 34/149] libattr-0:2.5.2-5.fc42.x86_64 100% | 227.8 KiB/s | 17.1 KiB | 00m00s
[ 35/149] libcap-0:2.73-2.fc42.x86_64 100% | 969.0 KiB/s | 84.3 KiB | 00m00s
[ 36/149] fedora-repos-0:42-1.noarch 100% | 137.7 KiB/s | 9.2 KiB | 00m00s
[ 37/149] mpfr-0:4.2.2-1.fc42.x86_64 100% | 2.9 MiB/s | 345.3 KiB | 00m00s
[ 38/149] xz-1:5.8.1-2.fc42.x86_64 100% | 333.9 KiB/s | 572.6 KiB | 00m02s
[ 39/149] ed-0:1.21-2.fc42.x86_64 100% | 1.0 MiB/s | 82.0 KiB | 00m00s
[ 40/149] ansible-srpm-macros-0:1-17.1. 100% | 298.7 KiB/s | 20.3 KiB | 00m00s
[ 41/149] build-reproducibility-srpm-ma 100% | 174.4 KiB/s | 11.7 KiB | 00m00s
[ 42/149] forge-srpm-macros-0:0.4.0-2.f 100% | 291.9 KiB/s | 19.9 KiB | 00m00s
[ 43/149] fpc-srpm-macros-0:1.3-14.fc42 100% | 121.5 KiB/s | 8.0 KiB | 00m00s
[ 44/149] ghc-srpm-macros-0:1.9.2-2.fc4 100% | 136.7 KiB/s | 9.2 KiB | 00m00s
[ 45/149] gnat-srpm-macros-0:6-7.fc42.n 100% | 128.5 KiB/s | 8.6 KiB | 00m00s
[ 46/149] kernel-srpm-macros-0:1.0-25.f 100% | 143.1 KiB/s | 9.9 KiB | 00m00s
[ 47/149] lua-srpm-macros-0:1-15.fc42.n 100% | 125.6 KiB/s | 8.9 KiB | 00m00s
[ 48/149] glibc-common-0:2.41-11.fc42.x 100% | 397.5 KiB/s | 385.6 KiB | 00m01s
[ 49/149] ocaml-srpm-macros-0:10-4.fc42 100% | 124.4 KiB/s | 9.2 KiB | 00m00s
[ 50/149] openblas-srpm-macros-0:2-19.f 100% | 110.9 KiB/s | 7.8 KiB | 00m00s
[ 51/149] package-notes-srpm-macros-0:0 100% | 138.2 KiB/s | 9.3 KiB | 00m00s
[ 52/149] perl-srpm-macros-0:1-57.fc42. 100% | 119.8 KiB/s | 8.5 KiB | 00m00s
[ 53/149] tree-sitter-srpm-macros-0:0.1 100% | 167.6 KiB/s | 11.2 KiB | 00m00s
[ 54/149] zig-srpm-macros-0:1-4.fc42.no 100% | 124.9 KiB/s | 8.2 KiB | 00m00s
[ 55/149] libblkid-0:2.40.4-7.fc42.x86_ 100% | 1.4 MiB/s | 122.5 KiB | 00m00s
[ 56/149] zip-0:3.0-43.fc42.x86_64 100% | 2.5 MiB/s | 263.5 KiB | 00m00s
[ 57/149] libcap-ng-0:0.8.5-4.fc42.x86_ 100% | 466.2 KiB/s | 32.2 KiB | 00m00s
[ 58/149] libfdisk-0:2.40.4-7.fc42.x86_ 100% | 1.8 MiB/s | 158.5 KiB | 00m00s
[ 59/149] libmount-0:2.40.4-7.fc42.x86_ 100% | 1.8 MiB/s | 155.1 KiB | 00m00s
[ 60/149] libsmartcols-0:2.40.4-7.fc42. 100% | 1.0 MiB/s | 81.2 KiB | 00m00s
[ 61/149] libuuid-0:2.40.4-7.fc42.x86_6 100% | 367.2 KiB/s | 25.3 KiB | 00m00s
[ 62/149] util-linux-core-0:2.40.4-7.fc 100% | 3.7 MiB/s | 529.2 KiB | 00m00s
[ 63/149] ncurses-base-0:6.5-5.20250125 100% | 1.1 MiB/s | 88.1 KiB | 00m00s
[ 64/149] pcre2-syntax-0:10.45-1.fc42.n 100% | 1.8 MiB/s | 161.7 KiB | 00m00s
[ 65/149] xz-libs-1:5.8.1-2.fc42.x86_64 100% | 357.6 KiB/s | 113.0 KiB | 00m00s
[ 66/149] libzstd-0:1.5.7-1.fc42.x86_64 100% | 2.8 MiB/s | 314.8 KiB | 00m00s
[ 67/149] lz4-libs-0:1.10.0-2.fc42.x86_ 100% | 1.0 MiB/s | 78.1 KiB | 00m00s
[ 68/149] rpm-sequoia-0:1.7.0-5.fc42.x8 100% | 4.2 MiB/s | 911.1 KiB | 00m00s
[ 69/149] fedora-gpg-keys-0:42-1.noarch 100% | 1.6 MiB/s | 135.6 KiB | 00m00s
[ 70/149] gnulib-l10n-0:20241231-1.fc42 100% | 475.0 KiB/s | 150.1 KiB | 00m00s
[ 71/149] add-determinism-0:0.6.0-1.fc4 100% | 4.9 MiB/s | 918.3 KiB | 00m00s
[ 72/149] coreutils-common-0:9.6-6.fc42 100% | 783.9 KiB/s | 2.1 MiB | 00m03s
[ 73/149] basesystem-0:11-22.fc42.noarc 100% | 102.7 KiB/s | 7.3 KiB | 00m00s
[ 74/149] dwz-0:0.16-1.fc42.x86_64 100% | 489.3 KiB/s | 135.5 KiB | 00m00s
[ 75/149] efi-srpm-macros-0:6-3.fc42.no 100% | 284.7 KiB/s | 22.5 KiB | 00m00s
[ 76/149] file-0:5.46-3.fc42.x86_64 100% | 298.5 KiB/s | 48.6 KiB | 00m00s
[ 77/149] file-libs-0:5.46-3.fc42.x86_6 100% | 816.8 KiB/s | 849.5 KiB | 00m01s
[ 78/149] filesystem-srpm-macros-0:3.18 100% | 318.0 KiB/s | 26.1 KiB | 00m00s
[ 79/149] fonts-srpm-macros-1:2.0.5-22. 100% | 299.0 KiB/s | 27.2 KiB | 00m00s
[ 80/149] go-srpm-macros-0:3.8.0-1.fc42 100% | 341.0 KiB/s | 28.3 KiB | 00m00s
[ 81/149] pyproject-srpm-macros-0:1.18. 100% | 165.4 KiB/s | 13.7 KiB | 00m00s
[ 82/149] glibc-gconv-extra-0:2.41-11.f 100% | 763.1 KiB/s | 1.6 MiB | 00m02s
[ 83/149] python-srpm-macros-0:3.13-5.f 100% | 249.6 KiB/s | 22.5 KiB | 00m00s
[ 84/149] qt5-srpm-macros-0:5.15.17-1.f 100% | 95.8 KiB/s | 8.7 KiB | 00m00s
[ 85/149] qt6-srpm-macros-0:6.9.2-1.fc4 100% | 104.2 KiB/s | 9.4 KiB | 00m00s
[ 86/149] rust-srpm-macros-0:26.4-1.fc4 100% | 106.6 KiB/s | 11.2 KiB | 00m00s
[ 87/149] libgcc-0:15.2.1-1.fc42.x86_64 100% | 491.0 KiB/s | 131.6 KiB | 00m00s
[ 88/149] zlib-ng-compat-0:2.2.5-2.fc42 100% | 167.8 KiB/s | 79.2 KiB | 00m00s
[ 89/149] glibc-0:2.41-11.fc42.x86_64 100% | 699.3 KiB/s | 2.2 MiB | 00m03s
[ 90/149] elfutils-libelf-0:0.193-2.fc4 100% | 637.4 KiB/s | 207.8 KiB | 00m00s
[ 91/149] elfutils-libs-0:0.193-2.fc42. 100% | 678.9 KiB/s | 270.2 KiB | 00m00s
[ 92/149] elfutils-debuginfod-client-0: 100% | 295.2 KiB/s | 46.9 KiB | 00m00s
[ 93/149] filesystem-0:3.18-47.fc42.x86 100% | 805.0 KiB/s | 1.3 MiB | 00m02s
[ 94/149] json-c-0:0.18-2.fc42.x86_64 100% | 168.2 KiB/s | 44.9 KiB | 00m00s
[ 95/149] libselinux-0:3.8-3.fc42.x86_6 100% | 608.0 KiB/s | 96.7 KiB | 00m00s
[ 96/149] elfutils-0:0.193-2.fc42.x86_6 100% | 712.5 KiB/s | 571.4 KiB | 00m01s
[ 97/149] libsepol-0:3.8-1.fc42.x86_64 100% | 1.2 MiB/s | 348.9 KiB | 00m00s
[ 98/149] libxcrypt-0:4.4.38-7.fc42.x86 100% | 530.1 KiB/s | 127.2 KiB | 00m00s
[ 99/149] audit-libs-0:4.1.1-1.fc42.x86 100% | 574.7 KiB/s | 138.5 KiB | 00m00s
[100/149] pam-libs-0:1.7.0-6.fc42.x86_6 100% | 340.4 KiB/s | 57.5 KiB | 00m00s
[101/149] libeconf-0:0.7.6-2.fc42.x86_6 100% | 439.6 KiB/s | 35.2 KiB | 00m00s
[102/149] systemd-libs-0:257.9-2.fc42.x 100% | 846.8 KiB/s | 810.3 KiB | 00m01s
[103/149] libsemanage-0:3.8.1-2.fc42.x8 100% | 765.5 KiB/s | 123.2 KiB | 00m00s
[104/149] libstdc++-0:15.2.1-1.fc42.x86 100% | 826.9 KiB/s | 917.8 KiB | 00m01s
[105/149] lua-libs-0:5.4.8-1.fc42.x86_6 100% | 547.4 KiB/s | 131.9 KiB | 00m00s
[106/149] libgomp-0:15.2.1-1.fc42.x86_6 100% | 774.2 KiB/s | 371.6 KiB | 00m00s
[107/149] sqlite-libs-0:3.47.2-5.fc42.x 100% | 785.2 KiB/s | 753.8 KiB | 00m01s
[108/149] jansson-0:2.14-2.fc42.x86_64 100% | 168.1 KiB/s | 45.7 KiB | 00m00s
[109/149] debugedit-0:5.1-7.fc42.x86_64 100% | 492.4 KiB/s | 78.8 KiB | 00m00s
[110/149] libarchive-0:3.8.1-1.fc42.x86 100% | 518.5 KiB/s | 421.6 KiB | 00m01s
[111/149] libxml2-0:2.12.10-1.fc42.x86_ 100% | 1.2 MiB/s | 683.7 KiB | 00m01s
[112/149] pkgconf-pkg-config-0:2.3.0-2. 100% | 141.8 KiB/s | 9.9 KiB | 00m00s
[113/149] pkgconf-0:2.3.0-2.fc42.x86_64 100% | 582.9 KiB/s | 44.9 KiB | 00m00s
[114/149] pkgconf-m4-0:2.3.0-2.fc42.noa 100% | 189.8 KiB/s | 14.2 KiB | 00m00s
[115/149] libpkgconf-0:2.3.0-2.fc42.x86 100% | 485.7 KiB/s | 38.4 KiB | 00m00s
[116/149] openssl-libs-1:3.2.4-4.fc42.x 100% | 655.7 KiB/s | 2.3 MiB | 00m04s
[117/149] curl-0:8.11.1-6.fc42.x86_64 100% | 462.2 KiB/s | 220.0 KiB | 00m00s
[118/149] crypto-policies-0:20250707-1. 100% | 592.4 KiB/s | 96.0 KiB | 00m00s
[119/149] elfutils-default-yama-scope-0 100% | 157.3 KiB/s | 12.6 KiB | 00m00s
[120/149] libffi-0:3.4.6-5.fc42.x86_64 100% | 578.7 KiB/s | 39.9 KiB | 00m00s
[121/149] p11-kit-0:0.25.8-1.fc42.x86_6 100% | 273.4 KiB/s | 503.5 KiB | 00m02s
[122/149] libtasn1-0:4.20.0-1.fc42.x86_ 100% | 1.0 MiB/s | 75.0 KiB | 00m00s
[123/149] p11-kit-trust-0:0.25.8-1.fc42 100% | 282.9 KiB/s | 139.2 KiB | 00m00s
[124/149] alternatives-0:1.33-1.fc42.x8 100% | 248.7 KiB/s | 40.5 KiB | 00m00s
[125/149] fedora-release-0:42-30.noarch 100% | 169.0 KiB/s | 13.5 KiB | 00m00s
[126/149] ca-certificates-0:2025.2.80_v 100% | 312.6 KiB/s | 973.5 KiB | 00m03s
[127/149] fedora-release-identity-basic 100% | 172.2 KiB/s | 14.3 KiB | 00m00s
[128/149] libbrotli-0:1.1.0-6.fc42.x86_ 100% | 3.2 MiB/s | 339.8 KiB | 00m00s
[129/149] libidn2-0:2.3.8-1.fc42.x86_64 100% | 2.1 MiB/s | 174.8 KiB | 00m00s
[130/149] libnghttp2-0:1.64.0-3.fc42.x8 100% | 1.0 MiB/s | 77.7 KiB | 00m00s
[131/149] libpsl-0:0.21.5-5.fc42.x86_64 100% | 902.0 KiB/s | 64.0 KiB | 00m00s
[132/149] libcurl-0:8.11.1-6.fc42.x86_6 100% | 356.7 KiB/s | 371.7 KiB | 00m01s
[133/149] libssh-0:0.11.3-1.fc42.x86_64 100% | 316.6 KiB/s | 233.0 KiB | 00m01s
[134/149] libunistring-0:1.1-9.fc42.x86 100% | 4.3 MiB/s | 542.5 KiB | 00m00s
[135/149] libssh-config-0:0.11.3-1.fc42 100% | 103.5 KiB/s | 9.1 KiB | 00m00s
[136/149] xxhash-libs-0:0.8.3-2.fc42.x8 100% | 574.5 KiB/s | 39.1 KiB | 00m00s
[137/149] systemd-standalone-sysusers-0 100% | 486.9 KiB/s | 154.8 KiB | 00m00s
[138/149] publicsuffix-list-dafsa-0:202 100% | 344.0 KiB/s | 59.2 KiB | 00m00s
[139/149] krb5-libs-0:1.21.3-6.fc42.x86 100% | 789.8 KiB/s | 759.8 KiB | 00m01s
[140/149] keyutils-libs-0:1.6.3-5.fc42. 100% | 470.7 KiB/s | 31.5 KiB | 00m00s
[141/149] libcom_err-0:1.47.2-3.fc42.x8 100% | 401.9 KiB/s | 26.9 KiB | 00m00s
[142/149] libverto-0:0.3.2-10.fc42.x86_ 100% | 310.5 KiB/s | 20.8 KiB | 00m00s
[143/149] openldap-0:2.6.10-1.fc42.x86_ 100% | 289.9 KiB/s | 258.6 KiB | 00m01s
[144/149] binutils-0:2.44-6.fc42.x86_64 100% | 600.1 KiB/s | 5.8 MiB | 00m10s
[145/149] cyrus-sasl-lib-0:2.1.28-30.fc 100% | 5.9 MiB/s | 793.5 KiB | 00m00s
[146/149] libtool-ltdl-0:2.5.4-4.fc42.x 100% | 539.9 KiB/s | 36.2 KiB | 00m00s
[147/149] gdbm-libs-1:1.23-9.fc42.x86_6 100% | 838.6 KiB/s | 57.0 KiB | 00m00s
[148/149] libevent-0:2.1.12-15.fc42.x86 100% | 579.5 KiB/s | 260.2 KiB | 00m00s
[149/149] gdb-minimal-0:16.3-1.fc42.x86 100% | 832.7 KiB/s | 4.4 MiB | 00m05s
--------------------------------------------------------------------------------
[149/149] Total 100% | 2.0 MiB/s | 52.4 MiB | 00m26s
Running transaction
Importing OpenPGP key 0x105EF944:
 UserID : "Fedora (42) <fedora-42-primary@fedoraproject.org>"
 Fingerprint: B0F4950458F69E1150C6C5EDC8AC4916105EF944
 From : file:///usr/share/distribution-gpg-keys/fedora/RPM-GPG-KEY-fedora-42-primary
The key was successfully imported.
[ 1/151] Verify package files 100% | 726.0 B/s | 149.0 B | 00m00s
[ 2/151] Prepare transaction 100% | 1.9 KiB/s | 149.0 B | 00m00s
[ 3/151] Installing libgcc-0:15.2.1-1. 100% | 130.9 MiB/s | 268.2 KiB | 00m00s
[ 4/151] Installing publicsuffix-list- 100% | 68.2 MiB/s | 69.8 KiB | 00m00s
[ 5/151] Installing libssh-config-0:0. 100% | 0.0 B/s | 816.0 B | 00m00s
[ 6/151] Installing fedora-release-ide 100% | 882.8 KiB/s | 904.0 B | 00m00s
[ 7/151] Installing fedora-gpg-keys-0: 100% | 19.0 MiB/s | 174.8 KiB | 00m00s
[ 8/151] Installing fedora-repos-0:42- 100% | 0.0 B/s | 5.7 KiB | 00m00s
[ 9/151] Installing fedora-release-com 100% | 12.0 MiB/s | 24.5 KiB | 00m00s
[ 10/151] Installing fedora-release-0:4 100% | 3.6 KiB/s | 124.0 B | 00m00s
>>> Running sysusers scriptlet: setup-0:2.15.0-13.fc42.noarch
>>> Finished sysusers scriptlet: setup-0:2.15.0-13.fc42.noarch
>>> Scriptlet output:
>>> Creating group 'adm' with GID 4.
>>> Creating group 'audio' with GID 63.
>>> Creating group 'bin' with GID 1.
>>> Creating group 'cdrom' with GID 11.
>>> Creating group 'clock' with GID 103.
>>> Creating group 'daemon' with GID 2.
>>> Creating group 'dialout' with GID 18.
>>> Creating group 'disk' with GID 6.
>>> Creating group 'floppy' with GID 19.
>>> Creating group 'ftp' with GID 50.
>>> Creating group 'games' with GID 20.
>>> Creating group 'input' with GID 104.
>>> Creating group 'kmem' with GID 9.
>>> Creating group 'kvm' with GID 36.
>>> Creating group 'lock' with GID 54.
>>> Creating group 'lp' with GID 7.
>>> Creating group 'mail' with GID 12.
>>> Creating group 'man' with GID 15.
>>> Creating group 'mem' with GID 8.
>>> Creating group 'nobody' with GID 65534.
>>> Creating group 'render' with GID 105.
>>> Creating group 'root' with GID 0.
>>> Creating group 'sgx' with GID 106.
>>> Creating group 'sys' with GID 3.
>>> Creating group 'tape' with GID 33.
>>> Creating group 'tty' with GID 5.
>>> Creating group 'users' with GID 100.
>>> Creating group 'utmp' with GID 22.
>>> Creating group 'video' with GID 39.
>>> Creating group 'wheel' with GID 10.
>>>
>>> Running sysusers scriptlet: setup-0:2.15.0-13.fc42.noarch
>>> Finished sysusers scriptlet: setup-0:2.15.0-13.fc42.noarch
>>> Scriptlet output:
>>> Creating user 'adm' (adm) with UID 3 and GID 4.
>>> Creating user 'bin' (bin) with UID 1 and GID 1.
>>> Creating user 'daemon' (daemon) with UID 2 and GID 2.
>>> Creating user 'ftp' (FTP User) with UID 14 and GID 50.
>>> Creating user 'games' (games) with UID 12 and GID 20.
>>> Creating user 'halt' (halt) with UID 7 and GID 0.
>>> Creating user 'lp' (lp) with UID 4 and GID 7.
>>> Creating user 'mail' (mail) with UID 8 and GID 12.
>>> Creating user 'nobody' (Kernel Overflow User) with UID 65534 and GID 65534.
>>> Creating user 'operator' (operator) with UID 11 and GID 0.
>>> Creating user 'root' (Super User) with UID 0 and GID 0.
>>> Creating user 'shutdown' (shutdown) with UID 6 and GID 0.
>>> Creating user 'sync' (sync) with UID 5 and GID 0.
>>>
[ 11/151] Installing setup-0:2.15.0-13. 100% | 39.4 MiB/s | 726.7 KiB | 00m00s
>>> [RPM] /etc/hosts created as /etc/hosts.rpmnew
[ 12/151] Installing filesystem-0:3.18- 100% | 1.4 MiB/s | 212.8 KiB | 00m00s
[ 13/151] Installing basesystem-0:11-22 100% | 0.0 B/s | 124.0 B | 00m00s
[ 14/151] Installing pkgconf-m4-0:2.3.0 100% | 0.0 B/s | 14.8 KiB | 00m00s
[ 15/151] Installing rust-srpm-macros-0 100% | 0.0 B/s | 5.6 KiB | 00m00s
[ 16/151] Installing qt6-srpm-macros-0: 100% | 0.0 B/s | 740.0 B | 00m00s
[ 17/151] Installing qt5-srpm-macros-0: 100% | 0.0 B/s | 776.0 B | 00m00s
[ 18/151] Installing gnulib-l10n-0:2024 100% | 107.7 MiB/s | 661.9 KiB | 00m00s
[ 19/151] Installing coreutils-common-0 100% | 242.5 MiB/s | 11.2 MiB | 00m00s
[ 20/151] Installing pcre2-syntax-0:10. 100% | 135.0 MiB/s | 276.4 KiB | 00m00s
[ 21/151] Installing ncurses-base-0:6.5 100% | 38.2 MiB/s | 352.2 KiB | 00m00s
[ 22/151] Installing glibc-minimal-lang 100% | 0.0 B/s | 124.0 B | 00m00s
[ 23/151] Installing ncurses-libs-0:6.5 100% | 155.1 MiB/s | 952.8 KiB | 00m00s
[ 24/151] Installing glibc-0:2.41-11.fc 100% | 154.7 MiB/s | 6.7 MiB | 00m00s
[ 25/151] Installing bash-0:5.2.37-1.fc 100% | 199.3 MiB/s | 8.2 MiB | 00m00s
[ 26/151] Installing glibc-common-0:2.4 100% | 51.0 MiB/s | 1.0 MiB | 00m00s
[ 27/151] Installing glibc-gconv-extra- 100% | 143.3 MiB/s | 7.3 MiB | 00m00s
[ 28/151] Installing zlib-ng-compat-0:2 100% | 135.2 MiB/s | 138.4 KiB | 00m00s
[ 29/151] Installing bzip2-libs-0:1.0.8 100% | 83.7 MiB/s | 85.7 KiB | 00m00s
[ 30/151] Installing xz-libs-1:5.8.1-2. 100% | 213.8 MiB/s | 218.9 KiB | 00m00s
[ 31/151] Installing libuuid-0:2.40.4-7 100% | 37.5 MiB/s | 38.4 KiB | 00m00s
[ 32/151] Installing libblkid-0:2.40.4- 100% | 128.7 MiB/s | 263.5 KiB | 00m00s
[ 33/151] Installing popt-0:1.19-8.fc42 100% | 27.2 MiB/s | 139.4 KiB | 00m00s
[ 34/151] Installing readline-0:8.2-13. 100% | 237.9 MiB/s | 487.1 KiB | 00m00s
[ 35/151] Installing gmp-1:6.3.0-4.fc42 100% | 264.8 MiB/s | 813.5 KiB | 00m00s
[ 36/151] Installing libzstd-0:1.5.7-1. 100% | 263.4 MiB/s | 809.1 KiB | 00m00s
[ 37/151] Installing elfutils-libelf-0: 100% | 233.3 MiB/s | 1.2 MiB | 00m00s
[ 38/151] Installing libstdc++-0:15.2.1 100% | 257.8 MiB/s | 2.8 MiB | 00m00s
[ 39/151] Installing libxcrypt-0:4.4.38 100% | 140.2 MiB/s | 287.2 KiB | 00m00s
[ 40/151] Installing libattr-0:2.5.2-5. 100% | 27.4 MiB/s | 28.1 KiB | 00m00s
[ 41/151] Installing libacl-0:2.3.2-3.f 100% | 38.2 MiB/s | 39.2 KiB | 00m00s
[ 42/151] Installing dwz-0:0.16-1.fc42. 100% | 20.1 MiB/s | 288.5 KiB | 00m00s
[ 43/151] Installing mpfr-0:4.2.2-1.fc4 100% | 202.7 MiB/s | 830.4 KiB | 00m00s
[ 44/151] Installing gawk-0:5.3.1-1.fc4 100% | 77.0 MiB/s | 1.7 MiB | 00m00s
[ 45/151] Installing unzip-0:6.0-66.fc4 100% | 25.6 MiB/s | 393.8 KiB | 00m00s
[ 46/151] Installing file-libs-0:5.46-3 100% | 494.1 MiB/s | 11.9 MiB | 00m00s
[ 47/151] Installing file-0:5.46-3.fc42 100% | 3.5 MiB/s | 101.7 KiB | 00m00s
[ 48/151] Installing crypto-policies-0: 100% | 16.4 MiB/s | 167.8 KiB | 00m00s
[ 49/151] Installing pcre2-0:10.45-1.fc 100% | 227.6 MiB/s | 699.1 KiB | 00m00s
[ 50/151] Installing grep-0:3.11-10.fc4 100% | 45.6 MiB/s | 1.0 MiB | 00m00s
[ 51/151] Installing xz-1:5.8.1-2.fc42. 100% | 57.9 MiB/s | 1.3 MiB | 00m00s
[ 52/151] Installing libcap-ng-0:0.8.5- 100% | 73.1 MiB/s | 74.8 KiB | 00m00s
[ 53/151] Installing audit-libs-0:4.1.1 100% | 124.2 MiB/s | 381.5 KiB | 00m00s
[ 54/151] Installing libsmartcols-0:2.4 100% | 177.3 MiB/s | 181.5 KiB | 00m00s
[ 55/151] Installing lz4-libs-0:1.10.0- 100% | 154.7 MiB/s | 158.5 KiB | 00m00s
[ 56/151] Installing libsepol-0:3.8-1.f 100% | 269.2 MiB/s | 827.0 KiB | 00m00s
[ 57/151] Installing libselinux-0:3.8-3 100% | 94.9 MiB/s | 194.3 KiB | 00m00s
[ 58/151] Installing findutils-1:4.10.0 100% | 81.5 MiB/s | 1.9 MiB | 00m00s
[ 59/151] Installing sed-0:4.9-4.fc42.x 100% | 44.5 MiB/s | 865.5 KiB | 00m00s
[ 60/151] Installing libmount-0:2.40.4- 100% | 174.5 MiB/s | 357.3 KiB | 00m00s
[ 61/151] Installing libeconf-0:0.7.6-2 100% | 64.7 MiB/s | 66.2 KiB | 00m00s
[ 62/151] Installing pam-libs-0:1.7.0-6 100% | 63.0 MiB/s | 129.1 KiB | 00m00s
[ 63/151] Installing libcap-0:2.73-2.fc 100% | 13.8 MiB/s | 212.1 KiB | 00m00s
[ 64/151] Installing systemd-libs-0:257 100% | 248.0 MiB/s | 2.2 MiB | 00m00s
[ 65/151] Installing lua-libs-0:5.4.8-1 100% | 137.7 MiB/s | 282.0 KiB | 00m00s
[ 66/151] Installing libffi-0:3.4.6-5.f 100% | 81.7 MiB/s | 83.7 KiB | 00m00s
[ 67/151] Installing libtasn1-0:4.20.0- 100% | 87.0 MiB/s | 178.1 KiB
| 00m00s [ 68/151] Installing p11-kit-0:0.25.8-1 100% | 84.8 MiB/s | 2.3 MiB | 00m00s [ 69/151] Installing alternatives-0:1.3 100% | 4.8 MiB/s | 63.8 KiB | 00m00s [ 70/151] Installing libunistring-0:1.1 100% | 246.7 MiB/s | 1.7 MiB | 00m00s [ 71/151] Installing libidn2-0:2.3.8-1. 100% | 91.6 MiB/s | 562.7 KiB | 00m00s [ 72/151] Installing libpsl-0:0.21.5-5. 100% | 75.7 MiB/s | 77.5 KiB | 00m00s [ 73/151] Installing p11-kit-trust-0:0. 100% | 14.6 MiB/s | 448.3 KiB | 00m00s [ 74/151] Installing openssl-libs-1:3.2 100% | 279.3 MiB/s | 7.8 MiB | 00m00s [ 75/151] Installing coreutils-0:9.6-6. 100% | 104.8 MiB/s | 5.5 MiB | 00m00s [ 76/151] Installing ca-certificates-0: 100% | 1.2 MiB/s | 2.5 MiB | 00m02s [ 77/151] Installing gzip-0:1.13-3.fc42 100% | 22.9 MiB/s | 398.4 KiB | 00m00s [ 78/151] Installing rpm-sequoia-0:1.7. 100% | 268.3 MiB/s | 2.4 MiB | 00m00s [ 79/151] Installing libevent-0:2.1.12- 100% | 177.1 MiB/s | 906.9 KiB | 00m00s [ 80/151] Installing util-linux-core-0: 100% | 62.0 MiB/s | 1.4 MiB | 00m00s [ 81/151] Installing systemd-standalone 100% | 19.4 MiB/s | 277.8 KiB | 00m00s [ 82/151] Installing tar-2:1.35-5.fc42. 100% | 113.9 MiB/s | 3.0 MiB | 00m00s [ 83/151] Installing libsemanage-0:3.8. 100% | 99.7 MiB/s | 306.2 KiB | 00m00s [ 84/151] Installing shadow-utils-2:4.1 100% | 86.0 MiB/s | 4.0 MiB | 00m00s [ 85/151] Installing zstd-0:1.5.7-1.fc4 100% | 90.0 MiB/s | 1.7 MiB | 00m00s [ 86/151] Installing zip-0:3.0-43.fc42. 
100% | 42.9 MiB/s | 702.4 KiB | 00m00s [ 87/151] Installing libfdisk-0:2.40.4- 100% | 182.3 MiB/s | 373.4 KiB | 00m00s [ 88/151] Installing libxml2-0:2.12.10- 100% | 89.3 MiB/s | 1.7 MiB | 00m00s [ 89/151] Installing libarchive-0:3.8.1 100% | 186.9 MiB/s | 957.1 KiB | 00m00s [ 90/151] Installing bzip2-0:1.0.8-20.f 100% | 5.6 MiB/s | 103.8 KiB | 00m00s [ 91/151] Installing add-determinism-0: 100% | 117.4 MiB/s | 2.5 MiB | 00m00s [ 92/151] Installing build-reproducibil 100% | 1.0 MiB/s | 1.0 KiB | 00m00s [ 93/151] Installing sqlite-libs-0:3.47 100% | 252.1 MiB/s | 1.5 MiB | 00m00s [ 94/151] Installing rpm-libs-0:4.20.1- 100% | 176.6 MiB/s | 723.4 KiB | 00m00s [ 95/151] Installing ed-0:1.21-2.fc42.x 100% | 10.4 MiB/s | 148.8 KiB | 00m00s [ 96/151] Installing patch-0:2.8-1.fc42 100% | 15.6 MiB/s | 224.3 KiB | 00m00s [ 97/151] Installing filesystem-srpm-ma 100% | 38.0 MiB/s | 38.9 KiB | 00m00s [ 98/151] Installing elfutils-default-y 100% | 145.9 KiB/s | 2.0 KiB | 00m00s [ 99/151] Installing elfutils-libs-0:0. 100% | 167.3 MiB/s | 685.2 KiB | 00m00s [100/151] Installing cpio-0:2.15-4.fc42 100% | 52.4 MiB/s | 1.1 MiB | 00m00s [101/151] Installing diffutils-0:3.12-1 100% | 71.0 MiB/s | 1.6 MiB | 00m00s [102/151] Installing json-c-0:0.18-2.fc 100% | 43.0 MiB/s | 88.0 KiB | 00m00s [103/151] Installing libgomp-0:15.2.1-1 100% | 176.6 MiB/s | 542.5 KiB | 00m00s [104/151] Installing rpm-build-libs-0:4 100% | 101.3 MiB/s | 207.4 KiB | 00m00s [105/151] Installing jansson-0:2.14-2.f 100% | 92.2 MiB/s | 94.4 KiB | 00m00s [106/151] Installing libpkgconf-0:2.3.0 100% | 77.4 MiB/s | 79.2 KiB | 00m00s [107/151] Installing pkgconf-0:2.3.0-2. 100% | 6.3 MiB/s | 91.0 KiB | 00m00s [108/151] Installing pkgconf-pkg-config 100% | 136.4 KiB/s | 1.8 KiB | 00m00s [109/151] Installing libbrotli-0:1.1.0- 100% | 205.9 MiB/s | 843.6 KiB | 00m00s [110/151] Installing libnghttp2-0:1.64. 100% | 83.7 MiB/s | 171.5 KiB | 00m00s [111/151] Installing xxhash-libs-0:0.8. 
100% | 89.4 MiB/s | 91.6 KiB | 00m00s [112/151] Installing keyutils-libs-0:1. 100% | 58.3 MiB/s | 59.7 KiB | 00m00s [113/151] Installing libcom_err-0:1.47. 100% | 66.6 MiB/s | 68.2 KiB | 00m00s [114/151] Installing libverto-0:0.3.2-1 100% | 26.6 MiB/s | 27.2 KiB | 00m00s [115/151] Installing krb5-libs-0:1.21.3 100% | 191.0 MiB/s | 2.3 MiB | 00m00s [116/151] Installing libssh-0:0.11.3-1. 100% | 185.3 MiB/s | 569.2 KiB | 00m00s [117/151] Installing libtool-ltdl-0:2.5 100% | 69.6 MiB/s | 71.2 KiB | 00m00s [118/151] Installing gdbm-libs-1:1.23-9 100% | 64.2 MiB/s | 131.6 KiB | 00m00s [119/151] Installing cyrus-sasl-lib-0:2 100% | 104.7 MiB/s | 2.3 MiB | 00m00s [120/151] Installing openldap-0:2.6.10- 100% | 128.8 MiB/s | 659.6 KiB | 00m00s [121/151] Installing libcurl-0:8.11.1-6 100% | 203.9 MiB/s | 835.2 KiB | 00m00s [122/151] Installing elfutils-debuginfo 100% | 6.0 MiB/s | 86.2 KiB | 00m00s [123/151] Installing elfutils-0:0.193-2 100% | 121.8 MiB/s | 2.9 MiB | 00m00s [124/151] Installing binutils-0:2.44-6. 100% | 228.7 MiB/s | 25.8 MiB | 00m00s [125/151] Installing gdb-minimal-0:16.3 100% | 232.4 MiB/s | 13.2 MiB | 00m00s [126/151] Installing debugedit-0:5.1-7. 
100% | 12.7 MiB/s | 195.4 KiB | 00m00s [127/151] Installing curl-0:8.11.1-6.fc 100% | 12.0 MiB/s | 453.1 KiB | 00m00s [128/151] Installing rpm-0:4.20.1-1.fc4 100% | 58.1 MiB/s | 2.5 MiB | 00m00s [129/151] Installing lua-srpm-macros-0: 100% | 1.9 MiB/s | 1.9 KiB | 00m00s [130/151] Installing tree-sitter-srpm-m 100% | 7.2 MiB/s | 7.4 KiB | 00m00s [131/151] Installing zig-srpm-macros-0: 100% | 1.6 MiB/s | 1.7 KiB | 00m00s [132/151] Installing efi-srpm-macros-0: 100% | 40.2 MiB/s | 41.1 KiB | 00m00s [133/151] Installing perl-srpm-macros-0 100% | 0.0 B/s | 1.1 KiB | 00m00s [134/151] Installing package-notes-srpm 100% | 2.0 MiB/s | 2.0 KiB | 00m00s [135/151] Installing openblas-srpm-macr 100% | 0.0 B/s | 392.0 B | 00m00s [136/151] Installing ocaml-srpm-macros- 100% | 2.1 MiB/s | 2.2 KiB | 00m00s [137/151] Installing kernel-srpm-macros 100% | 2.3 MiB/s | 2.3 KiB | 00m00s [138/151] Installing gnat-srpm-macros-0 100% | 1.2 MiB/s | 1.3 KiB | 00m00s [139/151] Installing ghc-srpm-macros-0: 100% | 1.0 MiB/s | 1.0 KiB | 00m00s [140/151] Installing fpc-srpm-macros-0: 100% | 0.0 B/s | 420.0 B | 00m00s [141/151] Installing ansible-srpm-macro 100% | 35.4 MiB/s | 36.2 KiB | 00m00s [142/151] Installing forge-srpm-macros- 100% | 39.3 MiB/s | 40.3 KiB | 00m00s [143/151] Installing fonts-srpm-macros- 100% | 55.7 MiB/s | 57.0 KiB | 00m00s [144/151] Installing go-srpm-macros-0:3 100% | 61.6 MiB/s | 63.0 KiB | 00m00s [145/151] Installing python-srpm-macros 100% | 50.9 MiB/s | 52.2 KiB | 00m00s [146/151] Installing redhat-rpm-config- 100% | 46.9 MiB/s | 192.2 KiB | 00m00s [147/151] Installing rpm-build-0:4.20.1 100% | 10.2 MiB/s | 177.4 KiB | 00m00s [148/151] Installing pyproject-srpm-mac 100% | 1.2 MiB/s | 2.5 KiB | 00m00s [149/151] Installing util-linux-0:2.40. 100% | 58.7 MiB/s | 3.5 MiB | 00m00s [150/151] Installing which-0:2.23-2.fc4 100% | 5.6 MiB/s | 85.7 KiB | 00m00s [151/151] Installing info-0:7.2-3.fc42. 100% | 134.4 KiB/s | 358.3 KiB | 00m03s Complete! 
Finish: installing minimal buildroot with dnf5
Start: creating root cache
Finish: creating root cache
Finish: chroot init
INFO: Installed packages:
INFO: add-determinism-0.6.0-1.fc42.x86_64 alternatives-1.33-1.fc42.x86_64 ansible-srpm-macros-1-17.1.fc42.noarch audit-libs-4.1.1-1.fc42.x86_64 basesystem-11-22.fc42.noarch bash-5.2.37-1.fc42.x86_64 binutils-2.44-6.fc42.x86_64 build-reproducibility-srpm-macros-0.6.0-1.fc42.noarch bzip2-1.0.8-20.fc42.x86_64 bzip2-libs-1.0.8-20.fc42.x86_64 ca-certificates-2025.2.80_v9.0.304-1.0.fc42.noarch coreutils-9.6-6.fc42.x86_64 coreutils-common-9.6-6.fc42.x86_64 cpio-2.15-4.fc42.x86_64 crypto-policies-20250707-1.gitad370a8.fc42.noarch curl-8.11.1-6.fc42.x86_64 cyrus-sasl-lib-2.1.28-30.fc42.x86_64 debugedit-5.1-7.fc42.x86_64 diffutils-3.12-1.fc42.x86_64 dwz-0.16-1.fc42.x86_64 ed-1.21-2.fc42.x86_64 efi-srpm-macros-6-3.fc42.noarch elfutils-0.193-2.fc42.x86_64 elfutils-debuginfod-client-0.193-2.fc42.x86_64 elfutils-default-yama-scope-0.193-2.fc42.noarch elfutils-libelf-0.193-2.fc42.x86_64 elfutils-libs-0.193-2.fc42.x86_64 fedora-gpg-keys-42-1.noarch fedora-release-42-30.noarch fedora-release-common-42-30.noarch fedora-release-identity-basic-42-30.noarch fedora-repos-42-1.noarch file-5.46-3.fc42.x86_64 file-libs-5.46-3.fc42.x86_64 filesystem-3.18-47.fc42.x86_64 filesystem-srpm-macros-3.18-47.fc42.noarch findutils-4.10.0-5.fc42.x86_64 fonts-srpm-macros-2.0.5-22.fc42.noarch forge-srpm-macros-0.4.0-2.fc42.noarch fpc-srpm-macros-1.3-14.fc42.noarch gawk-5.3.1-1.fc42.x86_64 gdb-minimal-16.3-1.fc42.x86_64 gdbm-libs-1.23-9.fc42.x86_64 ghc-srpm-macros-1.9.2-2.fc42.noarch glibc-2.41-11.fc42.x86_64 glibc-common-2.41-11.fc42.x86_64 glibc-gconv-extra-2.41-11.fc42.x86_64 glibc-minimal-langpack-2.41-11.fc42.x86_64 gmp-6.3.0-4.fc42.x86_64 gnat-srpm-macros-6-7.fc42.noarch gnulib-l10n-20241231-1.fc42.noarch go-srpm-macros-3.8.0-1.fc42.noarch gpg-pubkey-105ef944-65ca83d1 grep-3.11-10.fc42.x86_64 gzip-1.13-3.fc42.x86_64 info-7.2-3.fc42.x86_64
jansson-2.14-2.fc42.x86_64 json-c-0.18-2.fc42.x86_64 kernel-srpm-macros-1.0-25.fc42.noarch keyutils-libs-1.6.3-5.fc42.x86_64 krb5-libs-1.21.3-6.fc42.x86_64 libacl-2.3.2-3.fc42.x86_64 libarchive-3.8.1-1.fc42.x86_64 libattr-2.5.2-5.fc42.x86_64 libblkid-2.40.4-7.fc42.x86_64 libbrotli-1.1.0-6.fc42.x86_64 libcap-2.73-2.fc42.x86_64 libcap-ng-0.8.5-4.fc42.x86_64 libcom_err-1.47.2-3.fc42.x86_64 libcurl-8.11.1-6.fc42.x86_64 libeconf-0.7.6-2.fc42.x86_64 libevent-2.1.12-15.fc42.x86_64 libfdisk-2.40.4-7.fc42.x86_64 libffi-3.4.6-5.fc42.x86_64 libgcc-15.2.1-1.fc42.x86_64 libgomp-15.2.1-1.fc42.x86_64 libidn2-2.3.8-1.fc42.x86_64 libmount-2.40.4-7.fc42.x86_64 libnghttp2-1.64.0-3.fc42.x86_64 libpkgconf-2.3.0-2.fc42.x86_64 libpsl-0.21.5-5.fc42.x86_64 libselinux-3.8-3.fc42.x86_64 libsemanage-3.8.1-2.fc42.x86_64 libsepol-3.8-1.fc42.x86_64 libsmartcols-2.40.4-7.fc42.x86_64 libssh-0.11.3-1.fc42.x86_64 libssh-config-0.11.3-1.fc42.noarch libstdc++-15.2.1-1.fc42.x86_64 libtasn1-4.20.0-1.fc42.x86_64 libtool-ltdl-2.5.4-4.fc42.x86_64 libunistring-1.1-9.fc42.x86_64 libuuid-2.40.4-7.fc42.x86_64 libverto-0.3.2-10.fc42.x86_64 libxcrypt-4.4.38-7.fc42.x86_64 libxml2-2.12.10-1.fc42.x86_64 libzstd-1.5.7-1.fc42.x86_64 lua-libs-5.4.8-1.fc42.x86_64 lua-srpm-macros-1-15.fc42.noarch lz4-libs-1.10.0-2.fc42.x86_64 mpfr-4.2.2-1.fc42.x86_64 ncurses-base-6.5-5.20250125.fc42.noarch ncurses-libs-6.5-5.20250125.fc42.x86_64 ocaml-srpm-macros-10-4.fc42.noarch openblas-srpm-macros-2-19.fc42.noarch openldap-2.6.10-1.fc42.x86_64 openssl-libs-3.2.4-4.fc42.x86_64 p11-kit-0.25.8-1.fc42.x86_64 p11-kit-trust-0.25.8-1.fc42.x86_64 package-notes-srpm-macros-0.5-13.fc42.noarch pam-libs-1.7.0-6.fc42.x86_64 patch-2.8-1.fc42.x86_64 pcre2-10.45-1.fc42.x86_64 pcre2-syntax-10.45-1.fc42.noarch perl-srpm-macros-1-57.fc42.noarch pkgconf-2.3.0-2.fc42.x86_64 pkgconf-m4-2.3.0-2.fc42.noarch pkgconf-pkg-config-2.3.0-2.fc42.x86_64 popt-1.19-8.fc42.x86_64 publicsuffix-list-dafsa-20250616-1.fc42.noarch pyproject-srpm-macros-1.18.4-1.fc42.noarch
python-srpm-macros-3.13-5.fc42.noarch qt5-srpm-macros-5.15.17-1.fc42.noarch qt6-srpm-macros-6.9.2-1.fc42.noarch readline-8.2-13.fc42.x86_64 redhat-rpm-config-342-4.fc42.noarch rpm-4.20.1-1.fc42.x86_64 rpm-build-4.20.1-1.fc42.x86_64 rpm-build-libs-4.20.1-1.fc42.x86_64 rpm-libs-4.20.1-1.fc42.x86_64 rpm-sequoia-1.7.0-5.fc42.x86_64 rust-srpm-macros-26.4-1.fc42.noarch sed-4.9-4.fc42.x86_64 setup-2.15.0-13.fc42.noarch shadow-utils-4.17.4-1.fc42.x86_64 sqlite-libs-3.47.2-5.fc42.x86_64 systemd-libs-257.9-2.fc42.x86_64 systemd-standalone-sysusers-257.9-2.fc42.x86_64 tar-1.35-5.fc42.x86_64 tree-sitter-srpm-macros-0.1.0-8.fc42.noarch unzip-6.0-66.fc42.x86_64 util-linux-2.40.4-7.fc42.x86_64 util-linux-core-2.40.4-7.fc42.x86_64 which-2.23-2.fc42.x86_64 xxhash-libs-0.8.3-2.fc42.x86_64 xz-5.8.1-2.fc42.x86_64 xz-libs-5.8.1-2.fc42.x86_64 zig-srpm-macros-1-4.fc42.noarch zip-3.0-43.fc42.x86_64 zlib-ng-compat-2.2.5-2.fc42.x86_64 zstd-1.5.7-1.fc42.x86_64
Start: buildsrpm
Start: rpmbuild -bs
Building target platforms: x86_64
Building for target x86_64
setting SOURCE_DATE_EPOCH=1759363200
Wrote: /builddir/build/SRPMS/ollama-ggml-cuda-0.12.3-1.fc42.src.rpm
Finish: rpmbuild -bs
INFO: chroot_scan: 1 files copied to /var/lib/copr-rpmbuild/results/chroot_scan
INFO: /var/lib/mock/fedora-42-x86_64-1759428480.475249/root/var/log/dnf5.log
INFO: chroot_scan: creating tarball /var/lib/copr-rpmbuild/results/chroot_scan.tar.gz
/bin/tar: Removing leading `/' from member names
Finish: buildsrpm
INFO: Done(/var/lib/copr-rpmbuild/workspace/workdir-cnrrwoue/ollama-ggml-cuda/ollama-ggml-cuda.spec) Config(child) 1 minutes 3 seconds
INFO: Results and/or logs in: /var/lib/copr-rpmbuild/results
INFO: Cleaning up build root ('cleanup_on_success=True')
Start: clean chroot
INFO: unmounting tmpfs.
Finish: clean chroot
INFO: Start(/var/lib/copr-rpmbuild/results/ollama-ggml-cuda-0.12.3-1.fc42.src.rpm) Config(fedora-42-x86_64)
Start(bootstrap): chroot init
INFO: mounting tmpfs at /var/lib/mock/fedora-42-x86_64-bootstrap-1759428480.475249/root.
INFO: reusing tmpfs at /var/lib/mock/fedora-42-x86_64-bootstrap-1759428480.475249/root.
INFO: calling preinit hooks
INFO: enabled root cache
INFO: enabled package manager cache
Start(bootstrap): cleaning package manager metadata
Finish(bootstrap): cleaning package manager metadata
Finish(bootstrap): chroot init
Start: chroot init
INFO: mounting tmpfs at /var/lib/mock/fedora-42-x86_64-1759428480.475249/root.
INFO: calling preinit hooks
INFO: enabled root cache
Start: unpacking root cache
Finish: unpacking root cache
INFO: enabled package manager cache
Start: cleaning package manager metadata
Finish: cleaning package manager metadata
INFO: enabled HW Info plugin
INFO: Buildroot is handled by package management downloaded with a bootstrap image: rpm-4.20.1-1.fc42.x86_64 rpm-sequoia-1.7.0-5.fc42.x86_64 dnf5-5.2.16.0-1.fc42.x86_64 dnf5-plugins-5.2.16.0-1.fc42.x86_64
Finish: chroot init
Start: build phase for ollama-ggml-cuda-0.12.3-1.fc42.src.rpm
Start: build setup for ollama-ggml-cuda-0.12.3-1.fc42.src.rpm
Building target platforms: x86_64
Building for target x86_64
setting SOURCE_DATE_EPOCH=1759363200
Wrote: /builddir/build/SRPMS/ollama-ggml-cuda-0.12.3-1.fc42.src.rpm
Updating and loading repositories:
Additional repo https_developer_downlo 100% | 12.9 KiB/s | 3.9 KiB | 00m00s
Additional repo https_developer_downlo 100% | 12.9 KiB/s | 3.9 KiB | 00m00s
Copr repository 100% | 4.9 KiB/s | 1.5 KiB | 00m00s
fedora 100% | 82.3 KiB/s | 30.9 KiB | 00m00s
updates 100% | 87.9 KiB/s | 29.8 KiB | 00m00s
Repositories loaded.
Package Arch Version Repository Size
Installing:
 cmake x86_64 3.31.6-2.fc42 fedora 34.2 MiB
 cuda-compiler-12-9 x86_64 12.9.1-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 0.0 B
 cuda-compiler-13-0 x86_64 13.0.1-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 0.0 B
 cuda-libraries-devel-12-9 x86_64 12.9.1-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 0.0 B
 cuda-libraries-devel-13-0 x86_64 13.0.1-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 0.0 B
 cuda-nvml-devel-12-9 x86_64 12.9.79-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 1.4 MiB
 cuda-nvml-devel-13-0 x86_64 13.0.87-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 1.4 MiB
 gcc-c++ x86_64 15.2.1-1.fc42 updates 41.3 MiB
 gcc14 x86_64 14.2.1-8.fc42 fedora 117.2 MiB
 gcc14-c++ x86_64 14.2.1-8.fc42 fedora 59.6 MiB
Installing dependencies:
 annobin-docs noarch 12.94-1.fc42 updates 98.9 KiB
 annobin-plugin-gcc x86_64 12.94-1.fc42 updates 993.5 KiB
 cmake-data noarch 3.31.6-2.fc42 fedora 8.5 MiB
 cmake-filesystem x86_64 3.31.6-2.fc42 fedora 0.0 B
 cmake-rpm-macros noarch 3.31.6-2.fc42 fedora 7.7 KiB
 cpp x86_64 15.2.1-1.fc42 updates 37.9 MiB
 cuda-cccl-12-9 x86_64 12.9.27-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 12.7 MiB
 cuda-cccl-13-0 x86_64 13.0.85-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 13.2 MiB
 cuda-crt-12-9 x86_64 12.9.86-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 928.8 KiB
 cuda-crt-13-0 x86_64 13.0.88-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 936.8 KiB
 cuda-cudart-12-9 x86_64 12.9.79-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 785.8 KiB
 cuda-cudart-13-0 x86_64 13.0.88-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 754.1 KiB
 cuda-cudart-devel-12-9 x86_64 12.9.79-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 8.5 MiB
 cuda-cudart-devel-13-0 x86_64 13.0.88-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 6.2 MiB
 cuda-culibos-devel-13-0 x86_64 13.0.85-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 96.4 KiB
 cuda-cuobjdump-12-9 x86_64 12.9.82-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 665.7 KiB
 cuda-cuobjdump-13-0 x86_64 13.0.85-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 750.4 KiB
 cuda-cuxxfilt-12-9 x86_64 12.9.82-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 1.0 MiB
 cuda-cuxxfilt-13-0 x86_64 13.0.85-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 1.0 MiB
 cuda-driver-devel-12-9 x86_64 12.9.79-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 131.0 KiB
 cuda-driver-devel-13-0 x86_64 13.0.88-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 135.3 KiB
 cuda-nvcc-12-9 x86_64 12.9.86-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 317.8 MiB
 cuda-nvcc-13-0 x86_64 13.0.88-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 111.0 MiB
 cuda-nvprune-12-9 x86_64 12.9.82-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 181.0 KiB
 cuda-nvprune-13-0 x86_64 13.0.85-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 181.3 KiB
 cuda-nvrtc-12-9 x86_64 12.9.86-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 216.9 MiB
 cuda-nvrtc-13-0 x86_64 13.0.88-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 217.4 MiB
 cuda-nvrtc-devel-12-9 x86_64 12.9.86-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 248.0 MiB
 cuda-nvrtc-devel-13-0 x86_64 13.0.88-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 244.5 MiB
 cuda-nvvm-12-9 x86_64 12.9.86-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 132.6 MiB
 cuda-opencl-12-9 x86_64 12.9.19-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 91.7 KiB
 cuda-opencl-13-0 x86_64 13.0.85-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 96.5 KiB
 cuda-opencl-devel-12-9 x86_64 12.9.19-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 741.1 KiB
 cuda-opencl-devel-13-0 x86_64 13.0.85-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 747.9 KiB
 cuda-profiler-api-12-9 x86_64 12.9.79-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 73.4 KiB
 cuda-profiler-api-13-0 x86_64 13.0.85-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 77.6 KiB
 cuda-sandbox-devel-12-9 x86_64 12.9.19-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 146.3 KiB
 cuda-sandbox-devel-13-0 x86_64 13.0.85-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 149.4 KiB
 cuda-toolkit-12-9-config-common noarch 12.9.79-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 0.0 B
 cuda-toolkit-12-config-common noarch 12.9.79-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 44.0 B
 cuda-toolkit-13-0-config-common noarch 13.0.88-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 0.0 B
 cuda-toolkit-13-config-common noarch 13.0.88-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 44.0 B
 cuda-toolkit-config-common noarch 13.0.88-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 41.0 B
 emacs-filesystem noarch 1:30.0-4.fc42 fedora 0.0 B
 expat x86_64 2.7.2-1.fc42 updates 298.6 KiB
 gcc x86_64 15.2.1-1.fc42 updates 111.2 MiB
 gcc-plugin-annobin x86_64 15.2.1-1.fc42 updates 57.1 KiB
 glibc-devel x86_64 2.41-11.fc42 updates 2.3 MiB
 jsoncpp x86_64 1.9.6-1.fc42 fedora 261.6 KiB
 kernel-headers x86_64 6.16.2-200.fc42 updates 6.7 MiB
 libb2 x86_64 0.98.1-13.fc42 fedora 46.1 KiB
 libcublas-12-9 x86_64 12.9.1.4-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 815.6 MiB
 libcublas-13-0 x86_64 13.0.2.14-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 567.2 MiB
 libcublas-devel-12-9 x86_64 12.9.1.4-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 1.2 GiB
 libcublas-devel-13-0 x86_64 13.0.2.14-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 961.6 MiB
 libcufft-12-9 x86_64 11.4.1.4-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 277.2 MiB
 libcufft-13-0 x86_64 12.0.0.61-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 274.3 MiB
 libcufft-devel-12-9 x86_64 11.4.1.4-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 567.3 MiB
 libcufft-devel-13-0 x86_64 12.0.0.61-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 280.5 MiB
 libcufile-12-9 x86_64 1.14.1.1-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 3.2 MiB
 libcufile-13-0 x86_64 1.15.1.6-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 3.2 MiB
 libcufile-devel-12-9 x86_64 1.14.1.1-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 27.9 MiB
 libcufile-devel-13-0 x86_64 1.15.1.6-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 27.9 MiB
 libcurand-12-9 x86_64 10.3.10.19-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 159.3 MiB
 libcurand-13-0 x86_64 10.4.0.35-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 126.6 MiB
 libcurand-devel-12-9 x86_64 10.3.10.19-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 161.3 MiB
 libcurand-devel-13-0 x86_64 10.4.0.35-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 129.0 MiB
 libcusolver-12-9 x86_64 11.7.5.82-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 470.6 MiB
 libcusolver-13-0 x86_64 12.0.4.66-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 233.8 MiB
 libcusolver-devel-12-9 x86_64 11.7.5.82-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 332.5 MiB
 libcusolver-devel-13-0 x86_64 12.0.4.66-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 180.9 MiB
 libcusparse-12-9 x86_64 12.5.10.65-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 463.0 MiB
 libcusparse-13-0 x86_64 12.6.3.3-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 155.1 MiB
 libcusparse-devel-12-9 x86_64 12.5.10.65-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 960.3 MiB
 libcusparse-devel-13-0 x86_64 12.6.3.3-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 348.7 MiB
 libmpc x86_64 1.3.1-7.fc42 fedora 164.5 KiB
 libnpp-12-9 x86_64 12.4.1.87-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 393.0 MiB
 libnpp-13-0 x86_64 13.0.1.2-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 157.3 MiB
 libnpp-devel-12-9 x86_64 12.4.1.87-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 406.2 MiB
 libnpp-devel-13-0 x86_64 13.0.1.2-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 184.5 MiB
 libnvfatbin-12-9 x86_64 12.9.82-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 2.4 MiB
 libnvfatbin-13-0 x86_64 13.0.85-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 2.4 MiB
 libnvfatbin-devel-12-9 x86_64 12.9.82-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 2.3 MiB
 libnvfatbin-devel-13-0 x86_64 13.0.85-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 2.3 MiB
 libnvjitlink-12-9 x86_64 12.9.86-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 91.6 MiB
 libnvjitlink-13-0 x86_64 13.0.88-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 94.3 MiB
 libnvjitlink-devel-12-9 x86_64 12.9.86-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 127.6 MiB
 libnvjitlink-devel-13-0 x86_64 13.0.88-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 130.0 MiB
 libnvjpeg-12-9 x86_64 12.4.0.76-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 9.0 MiB
 libnvjpeg-13-0 x86_64 13.0.1.86-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 5.7 MiB
 libnvjpeg-devel-12-9 x86_64 12.4.0.76-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64 9.4 MiB
 libnvjpeg-devel-13-0 x86_64 13.0.1.86-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 6.4 MiB
 libnvptxcompiler-13-0 x86_64 13.0.88-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 85.4 MiB
 libnvvm-13-0 x86_64 13.0.88-1 https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 133.6 MiB
 libstdc++-devel x86_64 15.2.1-1.fc42 updates 16.1 MiB
 libuv x86_64 1:1.51.0-1.fc42 updates 570.2 KiB
 libxcrypt-devel x86_64 4.4.38-7.fc42 updates 30.8 KiB
 make x86_64 1:4.4.1-10.fc42 fedora 1.8 MiB
 mpdecimal x86_64 4.0.1-1.fc42 updates 217.2 KiB
 python-pip-wheel noarch 24.3.1-5.fc42 updates 1.2 MiB
 python3 x86_64 3.13.7-1.fc42 updates 28.7 KiB
 python3-libs x86_64 3.13.7-1.fc42 updates 40.1 MiB
 rhash x86_64 1.4.5-2.fc42 fedora 351.0 KiB
 tzdata noarch 2025b-1.fc42 fedora 1.6 MiB
 vim-filesystem noarch 2:9.1.1775-1.fc42 updates 40.0 B
Transaction Summary:
 Installing: 115 packages
Total size of inbound packages is 7 GiB. Need to download 7 GiB.
After this operation, 12 GiB extra will be used (install 12 GiB, remove 0 B).
[ 1/115] cmake-0:3.31.6-2.fc42.x86_64 100% | 2.6 MiB/s | 12.2 MiB | 00m05s
[ 2/115] cuda-compiler-12-9-0:12.9.1-1 100% | 28.1 KiB/s | 7.4 KiB | 00m00s
[ 3/115] cuda-compiler-13-0-0:13.0.1-1 100% | 70.3 KiB/s | 7.5 KiB | 00m00s
[ 4/115] cuda-libraries-devel-12-9-0:1 100% | 69.3 KiB/s | 7.9 KiB | 00m00s
[ 5/115] cuda-libraries-devel-13-0-0:1 100% | 77.8 KiB/s | 7.9 KiB | 00m00s
[ 6/115] cuda-nvml-devel-12-9-0:12.9.7 100% | 736.8 KiB/s | 201.2 KiB | 00m00s
[ 7/115] cuda-nvml-devel-13-0-0:13.0.8 100% | 699.2 KiB/s | 218.9 KiB | 00m00s
[ 8/115] gcc-c++-0:15.2.1-1.fc42.x86_6 100% | 46.0 MiB/s | 15.3 MiB | 00m00s
[ 9/115] libmpc-0:1.3.1-7.fc42.x86_64 100% | 417.1 KiB/s | 70.9 KiB | 00m00s
[ 10/115] make-1:4.4.1-10.fc42.x86_64 100% | 1.3 MiB/s | 587.0 KiB | 00m00s
[ 11/115] cmake-data-0:3.31.6-2.fc42.no 100% | 2.5 MiB/s | 2.5 MiB | 00m01s
[ 12/115] cmake-filesystem-0:3.31.6-2.f 100% | 137.4 KiB/s | 17.6 KiB | 00m00s
[ 13/115] jsoncpp-0:1.9.6-1.fc42.x86_64 100% | 915.9 KiB/s | 103.5 KiB | 00m00s
[ 14/115] rhash-0:1.4.5-2.fc42.x86_64 100% | 1.1 MiB/s | 198.7 KiB | 00m00s
[ 15/115] cuda-cuobjdump-12-9-0:12.9.82 100% | 905.2 KiB/s | 277.9 KiB | 00m00s
[ 16/115] cuda-cuxxfilt-12-9-0:12.9.82- 100% | 1.2 MiB/s | 282.8 KiB | 00m00s
[ 17/115] cuda-nvcc-12-9-0:12.9.86-1.x8 100% | 26.3 MiB/s | 111.3 MiB | 00m04s
[ 18/115] cuda-nvprune-12-9-0:12.9.82-1 100% | 623.0 KiB/s | 76.0 KiB | 00m00s
[ 19/115] cuda-crt-13-0-0:13.0.88-1.x86 100% | 983.2 KiB/s | 120.9 KiB | 00m00s
[ 20/115] cuda-cuobjdump-13-0-0:13.0.85 100% | 2.3 MiB/s | 309.5 KiB | 00m00s
[ 21/115] cuda-cuxxfilt-13-0-0:13.0.85- 100% | 2.2 MiB/s | 283.6 KiB | 00m00s
[ 22/115] cuda-nvcc-13-0-0:13.0.88-1.x8 100% | 22.1 MiB/s | 35.3 MiB | 00m02s
[ 23/115] cuda-nvprune-13-0-0:13.0.85-1 100% | 554.9 KiB/s | 76.6 KiB | 00m00s
[ 24/115] libnvptxcompiler-13-0-0:13.0. 100% | 25.9 MiB/s | 21.3 MiB | 00m01s
[ 25/115] gcc14-c++-0:14.2.1-8.fc42.x86 100% | 1.0 MiB/s | 16.8 MiB | 00m17s
[ 26/115] cuda-cccl-12-9-0:12.9.27-1.x8 100% | 3.3 MiB/s | 1.7 MiB | 00m01s
[ 27/115] cuda-cudart-devel-12-9-0:12.9 100% | 8.0 MiB/s | 3.0 MiB | 00m00s
[ 28/115] cuda-driver-devel-12-9-0:12.9 100% | 116.1 KiB/s | 43.1 KiB | 00m00s
[ 29/115] libnvvm-13-0-0:13.0.88-1.x86_ 100% | 27.7 MiB/s | 58.3 MiB | 00m02s
[ 30/115] cuda-opencl-devel-12-9-0:12.9 100% | 307.0 KiB/s | 119.4 KiB | 00m00s
[ 31/115] cuda-profiler-api-12-9-0:12.9 100% | 127.3 KiB/s | 26.2 KiB | 00m00s
[ 32/115] cuda-sandbox-devel-12-9-0:12. 100% | 159.1 KiB/s | 44.2 KiB | 00m00s
[ 33/115] cuda-nvrtc-devel-12-9-0:12.9. 100% | 18.2 MiB/s | 74.2 MiB | 00m04s
[ 34/115] libcufft-devel-12-9-0:11.4.1. 100% | 18.2 MiB/s | 385.6 MiB | 00m21s
[ 35/115] libcufile-devel-12-9-0:1.14.1 100% | 6.8 MiB/s | 5.2 MiB | 00m01s
[ 36/115] libcurand-devel-12-9-0:10.3.1 100% | 17.0 MiB/s | 64.2 MiB | 00m04s
[ 37/115] libcublas-devel-12-9-0:12.9.1 100% | 19.1 MiB/s | 630.3 MiB | 00m33s
[ 38/115] libcusolver-devel-12-9-0:11.7 100% | 18.3 MiB/s | 213.1 MiB | 00m12s
[ 39/115] libnpp-devel-12-9-0:12.4.1.87 100% | 20.8 MiB/s | 268.0 MiB | 00m13s
[ 40/115] libnvfatbin-devel-12-9-0:12.9 100% | 3.5 MiB/s | 863.8 KiB | 00m00s
[ 41/115] gcc14-0:14.2.1-8.fc42.x86_64 100% | 608.8 KiB/s | 43.8 MiB | 01m14s
[ 42/115] libnvjpeg-devel-12-9-0:12.4.0 100% | 4.1 MiB/s | 4.9 MiB | 00m01s
[ 43/115] libnvjitlink-devel-12-9-0:12. 100% | 14.9 MiB/s | 36.1 MiB | 00m02s
[ 44/115] cuda-cccl-13-0-0:13.0.85-1.x8 100% | 3.3 MiB/s | 1.7 MiB | 00m01s
[ 45/115] cuda-culibos-devel-13-0-0:13. 100% | 204.4 KiB/s | 32.5 KiB | 00m00s
[ 46/115] cuda-cudart-devel-13-0-0:13.0 100% | 3.7 MiB/s | 1.9 MiB | 00m01s
[ 47/115] cuda-driver-devel-13-0-0:13.0 100% | 136.6 KiB/s | 44.3 KiB | 00m00s
[ 48/115] cuda-opencl-devel-13-0-0:13.0 100% | 202.0 KiB/s | 120.8 KiB | 00m01s
[ 49/115] cuda-profiler-api-13-0-0:13.0 100% | 52.7 KiB/s | 27.1 KiB | 00m01s
[ 50/115] cuda-sandbox-devel-13-0-0:13. 100% | 256.1 KiB/s | 45.3 KiB | 00m00s
[ 51/115] cuda-nvrtc-devel-13-0-0:13.0. 100% | 16.4 MiB/s | 73.7 MiB | 00m04s
[ 52/115] libcufft-devel-13-0-0:12.0.0. 100% | 13.7 MiB/s | 205.4 MiB | 00m15s
[ 53/115] libcufile-devel-13-0-0:1.15.1 100% | 9.6 MiB/s | 5.2 MiB | 00m01s
[ 54/115] libcusparse-devel-12-9-0:12.5 100% | 15.3 MiB/s | 710.9 MiB | 00m46s
[ 55/115] libcurand-devel-13-0-0:10.4.0 100% | 12.6 MiB/s | 56.0 MiB | 00m04s
[ 56/115] libcusolver-devel-13-0-0:12.0 100% | 13.4 MiB/s | 124.4 MiB | 00m09s
[ 57/115] libcublas-devel-13-0-0:13.0.2 100% | 13.5 MiB/s | 470.7 MiB | 00m35s
[ 58/115] libnvfatbin-devel-13-0-0:13.0 100% | 846.1 KiB/s | 877.4 KiB | 00m01s
[ 59/115] libnvjitlink-devel-13-0-0:13. 100% | 8.0 MiB/s | 36.7 MiB | 00m05s
[ 60/115] libnpp-devel-13-0-0:13.0.1.2- 100% | 11.6 MiB/s | 125.6 MiB | 00m11s
[ 61/115] libnvjpeg-devel-13-0-0:13.0.1 100% | 3.5 MiB/s | 3.4 MiB | 00m01s
[ 62/115] emacs-filesystem-1:30.0-4.fc4 100% | 26.1 KiB/s | 7.4 KiB | 00m00s
[ 63/115] gcc-0:15.2.1-1.fc42.x86_64 100% | 59.0 MiB/s | 39.4 MiB | 00m01s
[ 64/115] cuda-crt-12-9-0:12.9.86-1.x86 100% | 295.5 KiB/s | 119.7 KiB | 00m00s
[ 65/115] cuda-cudart-12-9-0:12.9.79-1. 100% | 539.4 KiB/s | 236.8 KiB | 00m00s
[ 66/115] libcusparse-devel-13-0-0:12.6 100% | 14.0 MiB/s | 286.7 MiB | 00m20s
[ 67/115] cuda-opencl-12-9-0:12.9.19-1. 100% | 342.4 KiB/s | 34.2 KiB | 00m00s
[ 68/115] cuda-nvvm-12-9-0:12.9.86-1.x8 100% | 13.5 MiB/s | 57.6 MiB | 00m04s
[ 69/115] cuda-nvrtc-12-9-0:12.9.86-1.x 100% | 13.4 MiB/s | 84.8 MiB | 00m06s
[ 70/115] libcufile-12-9-0:1.14.1.1-1.x 100% | 962.1 KiB/s | 1.2 MiB | 00m01s
[ 71/115] libcurand-12-9-0:10.3.10.19-1 100% | 12.4 MiB/s | 63.9 MiB | 00m05s
[ 72/115] libcufft-12-9-0:11.4.1.4-1.x8 100% | 12.8 MiB/s | 191.7 MiB | 00m15s
[ 73/115] libcusolver-12-9-0:11.7.5.82- 100% | 11.9 MiB/s | 324.9 MiB | 00m27s
[ 74/115] libcublas-12-9-0:12.9.1.4-1.x 100% | 12.7 MiB/s | 555.4 MiB | 00m44s
[ 75/115] libnvfatbin-12-9-0:12.9.82-1. 100% | 1.5 MiB/s | 940.1 KiB | 00m01s
[ 76/115] libcusparse-12-9-0:12.5.10.65 100% | 12.8 MiB/s | 351.7 MiB | 00m27s
[ 77/115] libnvjpeg-12-9-0:12.4.0.76-1. 100% | 5.6 MiB/s | 5.1 MiB | 00m01s
[ 78/115] cuda-cudart-13-0-0:13.0.88-1. 100% | 402.0 KiB/s | 223.1 KiB | 00m01s
[ 79/115] libnvjitlink-12-9-0:12.9.86-1 100% | 14.6 MiB/s | 37.6 MiB | 00m03s
[ 80/115] cuda-opencl-13-0-0:13.0.85-1. 100% | 560.1 KiB/s | 35.3 KiB | 00m00s
[ 81/115] cuda-nvrtc-13-0-0:13.0.88-1.x 100% | 11.5 MiB/s | 85.4 MiB | 00m07s
[ 82/115] libnpp-12-9-0:12.4.1.87-1.x86 100% | 13.3 MiB/s | 271.1 MiB | 00m20s
[ 83/115] libcufile-13-0-0:1.15.1.6-1.x 100% | 1.2 MiB/s | 1.2 MiB | 00m01s
[ 84/115] libcurand-13-0-0:10.4.0.35-1. 100% | 7.7 MiB/s | 55.7 MiB | 00m07s
[ 85/115] libcufft-13-0-0:12.0.0.61-1.x 100% | 10.3 MiB/s | 204.4 MiB | 00m20s
[ 86/115] libcusolver-13-0-0:12.0.4.66- 100% | 10.5 MiB/s | 191.4 MiB | 00m18s
[ 87/115] libcusparse-13-0-0:12.6.3.3-1 100% | 9.6 MiB/s | 139.2 MiB | 00m15s
[ 88/115] libcublas-13-0-0:13.0.2.14-1. 100% | 9.6 MiB/s | 401.1 MiB | 00m42s
[ 89/115] libnvfatbin-13-0-0:13.0.85-1. 100% | 1.5 MiB/s | 950.0 KiB | 00m01s
[ 90/115] libnvjpeg-13-0-0:13.0.1.86-1. 100% | 2.9 MiB/s | 3.5 MiB | 00m01s
[ 91/115] cpp-0:15.2.1-1.fc42.x86_64 100% | 21.7 MiB/s | 12.9 MiB | 00m01s
[ 92/115] libstdc++-devel-0:15.2.1-1.fc 100% | 27.6 MiB/s | 2.9 MiB | 00m00s
[ 93/115] glibc-devel-0:2.41-11.fc42.x8 100% | 23.4 MiB/s | 623.2 KiB | 00m00s
[ 94/115] vim-filesystem-2:9.1.1775-1.f 100% | 1.1 MiB/s | 15.4 KiB | 00m00s
[ 95/115] expat-0:2.7.2-1.fc42.x86_64 100% | 7.3 MiB/s | 119.0 KiB | 00m00s
[ 96/115] libuv-1:1.51.0-1.fc42.x86_64 100% | 13.0 MiB/s | 266.3 KiB | 00m00s
[ 97/115] cuda-toolkit-config-common-0: 100% | 29.5 KiB/s | 8.0 KiB | 00m00s
[ 98/115] cuda-toolkit-13-0-config-comm 100% | 20.1 KiB/s | 7.8 KiB | 00m00s
[ 99/115] cuda-toolkit-13-config-common 100% | 17.6 KiB/s | 8.0 KiB | 00m00s
[100/115] cuda-toolkit-12-9-config-comm 100% | 28.3 KiB/s | 7.8 KiB | 00m00s
[101/115] libnvjitlink-13-0-0:13.0.88-1 100% | 10.7 MiB/s | 38.5 MiB | 00m04s
[102/115] kernel-headers-0:6.16.2-200.f 100% | 26.7 MiB/s | 1.7 MiB | 00m00s
[103/115] libxcrypt-devel-0:4.4.38-7.fc 100% | 2.0 MiB/s | 29.4 KiB | 00m00s
[104/115] gcc-plugin-annobin-0:15.2.1-1 100% | 3.4 MiB/s | 55.8 KiB | 00m00s
[105/115] cuda-toolkit-12-config-common 100% | 36.9 KiB/s | 8.0 KiB | 00m00s
[106/115] annobin-plugin-gcc-0:12.94-1. 100% | 30.0 MiB/s | 981.9 KiB | 00m00s
[107/115] annobin-docs-0:12.94-1.fc42.n 100% | 1.7 MiB/s | 90.4 KiB | 00m00s
[108/115] python3-0:3.13.7-1.fc42.x86_6 100% | 2.1 MiB/s | 30.6 KiB | 00m00s
[109/115] cmake-rpm-macros-0:3.31.6-2.f 100% | 67.6 KiB/s | 16.9 KiB | 00m00s
[110/115] python3-libs-0:3.13.7-1.fc42.
100% | 52.9 MiB/s | 9.2 MiB | 00m00s [111/115] libb2-0:0.98.1-13.fc42.x86_64 100% | 224.6 KiB/s | 25.4 KiB | 00m00s [112/115] mpdecimal-0:4.0.1-1.fc42.x86_ 100% | 6.3 MiB/s | 97.1 KiB | 00m00s [113/115] python-pip-wheel-0:24.3.1-5.f 100% | 26.2 MiB/s | 1.2 MiB | 00m00s [114/115] libnpp-13-0-0:13.0.1.2-1.x86_ 100% | 17.1 MiB/s | 127.8 MiB | 00m07s [115/115] tzdata-0:2025b-1.fc42.noarch 100% | 1.3 MiB/s | 714.0 KiB | 00m01s -------------------------------------------------------------------------------- [115/115] Total 100% | 34.4 MiB/s | 7.2 GiB | 03m34s Running transaction [ 1/117] Verify package files 100% | 3.0 B/s | 115.0 B | 00m35s [ 2/117] Prepare transaction 100% | 858.0 B/s | 115.0 B | 00m00s [ 3/117] Installing cuda-toolkit-confi 100% | 0.0 B/s | 312.0 B | 00m00s [ 4/117] Installing cuda-toolkit-12-co 100% | 0.0 B/s | 316.0 B | 00m00s [ 5/117] Installing cuda-toolkit-12-9- 100% | 0.0 B/s | 124.0 B | 00m00s [ 6/117] Installing cuda-toolkit-13-co 100% | 0.0 B/s | 316.0 B | 00m00s [ 7/117] Installing cuda-toolkit-13-0- 100% | 0.0 B/s | 124.0 B | 00m00s [ 8/117] Installing cuda-culibos-devel 100% | 94.7 MiB/s | 97.0 KiB | 00m00s [ 9/117] Installing libmpc-0:1.3.1-7.f 100% | 81.1 MiB/s | 166.1 KiB | 00m00s [ 10/117] Installing make-1:4.4.1-10.fc 100% | 72.0 MiB/s | 1.8 MiB | 00m00s [ 11/117] Installing expat-0:2.7.2-1.fc 100% | 15.5 MiB/s | 300.7 KiB | 00m00s [ 12/117] Installing libstdc++-devel-0: 100% | 188.6 MiB/s | 16.2 MiB | 00m00s [ 13/117] Installing cuda-cccl-13-0-0:1 100% | 99.2 MiB/s | 13.6 MiB | 00m00s [ 14/117] Installing cuda-cccl-12-9-0:1 100% | 104.5 MiB/s | 13.1 MiB | 00m00s [ 15/117] Installing libnvvm-13-0-0:13. 
100% | 197.6 MiB/s | 133.6 MiB | 00m01s [ 16/117] Installing libnvptxcompiler-1 100% | 274.7 MiB/s | 85.4 MiB | 00m00s [ 17/117] Installing cuda-crt-13-0-0:13 100% | 184.0 MiB/s | 942.2 KiB | 00m00s [ 18/117] Installing cmake-filesystem-0 100% | 2.5 MiB/s | 7.6 KiB | 00m00s [ 19/117] Installing cpp-0:15.2.1-1.fc4 100% | 269.1 MiB/s | 37.9 MiB | 00m00s [ 20/117] Installing cuda-sandbox-devel 100% | 74.1 MiB/s | 151.7 KiB | 00m00s [ 21/117] Installing cuda-cudart-13-0-0 100% | 36.9 MiB/s | 755.6 KiB | 00m00s [ 22/117] Installing cuda-cudart-devel- 100% | 231.9 MiB/s | 6.3 MiB | 00m00s [ 23/117] Installing cuda-opencl-13-0-0 100% | 8.0 MiB/s | 98.1 KiB | 00m00s [ 24/117] Installing cuda-opencl-devel- 100% | 183.4 MiB/s | 751.3 KiB | 00m00s [ 25/117] Installing libcublas-13-0-0:1 100% | 254.8 MiB/s | 567.2 MiB | 00m02s [ 26/117] Installing libcublas-devel-13 100% | 276.2 MiB/s | 961.6 MiB | 00m03s [ 27/117] Installing libcufft-13-0-0:12 100% | 167.4 MiB/s | 274.3 MiB | 00m02s [ 28/117] Installing libcufft-devel-13- 100% | 172.7 MiB/s | 280.5 MiB | 00m02s [ 29/117] Installing libcufile-13-0-0:1 100% | 100.4 MiB/s | 3.2 MiB | 00m00s [ 30/117] Installing libcufile-devel-13 100% | 297.0 MiB/s | 27.9 MiB | 00m00s [ 31/117] Installing libcurand-13-0-0:1 100% | 277.1 MiB/s | 126.6 MiB | 00m00s [ 32/117] Installing libcurand-devel-13 100% | 285.9 MiB/s | 129.0 MiB | 00m00s [ 33/117] Installing libcusolver-13-0-0 100% | 274.8 MiB/s | 233.8 MiB | 00m01s [ 34/117] Installing libcusolver-devel- 100% | 281.4 MiB/s | 180.9 MiB | 00m01s [ 35/117] Installing libcusparse-13-0-0 100% | 274.5 MiB/s | 155.1 MiB | 00m01s [ 36/117] Installing libcusparse-devel- 100% | 397.6 MiB/s | 348.7 MiB | 00m01s [ 37/117] Installing libnpp-13-0-0:13.0 100% | 254.6 MiB/s | 157.4 MiB | 00m01s [ 38/117] Installing libnpp-devel-13-0- 100% | 285.6 MiB/s | 184.5 MiB | 00m01s [ 39/117] Installing libnvfatbin-13-0-0 100% | 83.4 MiB/s | 2.4 MiB | 00m00s [ 40/117] Installing libnvfatbin-devel- 100% | 180.2 MiB/s 
| 2.3 MiB | 00m00s [ 41/117] Installing libnvjitlink-13-0- 100% | 195.2 MiB/s | 94.3 MiB | 00m00s [ 42/117] Installing libnvjitlink-devel 100% | 234.6 MiB/s | 130.0 MiB | 00m01s [ 43/117] Installing libnvjpeg-13-0-0:1 100% | 138.2 MiB/s | 5.7 MiB | 00m00s [ 44/117] Installing libnvjpeg-devel-13 100% | 238.0 MiB/s | 6.4 MiB | 00m00s [ 45/117] Installing cuda-sandbox-devel 100% | 145.1 MiB/s | 148.6 KiB | 00m00s [ 46/117] Installing cuda-cudart-12-9-0 100% | 40.5 MiB/s | 787.3 KiB | 00m00s [ 47/117] Installing cuda-cudart-devel- 100% | 184.4 MiB/s | 8.5 MiB | 00m00s [ 48/117] Installing cuda-opencl-12-9-0 100% | 6.1 MiB/s | 93.4 KiB | 00m00s [ 49/117] Installing cuda-opencl-devel- 100% | 181.8 MiB/s | 744.4 KiB | 00m00s [ 50/117] Installing libcublas-12-9-0:1 100% | 202.9 MiB/s | 815.6 MiB | 00m04s [ 51/117] Installing libcublas-devel-12 100% | 224.3 MiB/s | 1.2 GiB | 00m05s [ 52/117] Installing libcufft-12-9-0:11 100% | 170.4 MiB/s | 277.2 MiB | 00m02s [ 53/117] Installing libcufft-devel-12- 100% | 169.6 MiB/s | 567.3 MiB | 00m03s [ 54/117] Installing libcufile-12-9-0:1 100% | 101.2 MiB/s | 3.2 MiB | 00m00s [ 55/117] Installing libcufile-devel-12 100% | 310.1 MiB/s | 27.9 MiB | 00m00s [ 56/117] Installing libcurand-12-9-0:1 100% | 267.3 MiB/s | 159.3 MiB | 00m01s [ 57/117] Installing libcurand-devel-12 100% | 226.2 MiB/s | 161.3 MiB | 00m01s [ 58/117] Installing libcusolver-12-9-0 100% | 146.1 MiB/s | 470.6 MiB | 00m03s [ 59/117] Installing libcusolver-devel- 100% | 111.2 MiB/s | 332.5 MiB | 00m03s [ 60/117] Installing libcusparse-12-9-0 100% | 143.0 MiB/s | 463.0 MiB | 00m03s [ 61/117] Installing libcusparse-devel- 100% | 161.9 MiB/s | 960.3 MiB | 00m06s [ 62/117] Installing libnpp-12-9-0:12.4 100% | 147.1 MiB/s | 393.0 MiB | 00m03s [ 63/117] Installing libnpp-devel-12-9- 100% | 154.5 MiB/s | 406.2 MiB | 00m03s [ 64/117] Installing libnvfatbin-12-9-0 100% | 85.6 MiB/s | 2.4 MiB | 00m00s [ 65/117] Installing libnvfatbin-devel- 100% | 192.3 MiB/s | 2.3 MiB | 00m00s [ 
66/117] Installing libnvjitlink-12-9- 100% | 207.6 MiB/s | 91.6 MiB | 00m00s [ 67/117] Installing libnvjitlink-devel 100% | 244.4 MiB/s | 127.6 MiB | 00m01s [ 68/117] Installing libnvjpeg-12-9-0:1 100% | 136.1 MiB/s | 9.0 MiB | 00m00s [ 69/117] Installing libnvjpeg-devel-12 100% | 167.7 MiB/s | 9.4 MiB | 00m00s [ 70/117] Installing python-pip-wheel-0 100% | 103.7 MiB/s | 1.2 MiB | 00m00s [ 71/117] Installing mpdecimal-0:4.0.1- 100% | 19.4 MiB/s | 218.8 KiB | 00m00s [ 72/117] Installing tzdata-0:2025b-1.f 100% | 21.5 MiB/s | 1.9 MiB | 00m00s [ 73/117] Installing libb2-0:0.98.1-13. 100% | 5.1 MiB/s | 47.2 KiB | 00m00s [ 74/117] Installing python3-libs-0:3.1 100% | 161.7 MiB/s | 40.4 MiB | 00m00s [ 75/117] Installing python3-0:3.13.7-1 100% | 1.4 MiB/s | 30.5 KiB | 00m00s [ 76/117] Installing cmake-rpm-macros-0 100% | 4.1 MiB/s | 8.3 KiB | 00m00s [ 77/117] Installing annobin-docs-0:12. 100% | 24.4 MiB/s | 100.0 KiB | 00m00s [ 78/117] Installing kernel-headers-0:6 100% | 106.7 MiB/s | 6.8 MiB | 00m00s [ 79/117] Installing libxcrypt-devel-0: 100% | 8.1 MiB/s | 33.1 KiB | 00m00s [ 80/117] Installing glibc-devel-0:2.41 100% | 75.2 MiB/s | 2.3 MiB | 00m00s [ 81/117] Installing gcc-0:15.2.1-1.fc4 100% | 282.5 MiB/s | 111.3 MiB | 00m00s [ 82/117] Installing gcc-c++-0:15.2.1-1 100% | 261.7 MiB/s | 41.4 MiB | 00m00s [ 83/117] Installing cuda-nvcc-13-0-0:1 100% | 180.2 MiB/s | 111.0 MiB | 00m01s [ 84/117] Installing gcc14-0:14.2.1-8.f 100% | 288.8 MiB/s | 117.2 MiB | 00m00s [ 85/117] Installing libuv-1:1.51.0-1.f 100% | 139.9 MiB/s | 573.0 KiB | 00m00s [ 86/117] Installing vim-filesystem-2:9 100% | 2.3 MiB/s | 4.7 KiB | 00m00s [ 87/117] Installing cuda-nvrtc-13-0-0: 100% | 212.5 MiB/s | 217.4 MiB | 00m01s [ 88/117] Installing cuda-nvrtc-devel-1 100% | 239.7 MiB/s | 244.5 MiB | 00m01s [ 89/117] Installing cuda-nvrtc-12-9-0: 100% | 209.7 MiB/s | 216.9 MiB | 00m01s [ 90/117] Installing cuda-nvrtc-devel-1 100% | 168.6 MiB/s | 248.0 MiB | 00m01s [ 91/117] Installing 
cuda-nvvm-12-9-0:1 100% | 168.6 MiB/s | 132.7 MiB | 00m01s [ 92/117] Installing cuda-crt-12-9-0:12 100% | 130.3 MiB/s | 933.9 KiB | 00m00s [ 93/117] Installing cuda-nvcc-12-9-0:1 100% | 180.0 MiB/s | 317.8 MiB | 00m02s [ 94/117] Installing emacs-filesystem-1 100% | 177.1 KiB/s | 544.0 B | 00m00s [ 95/117] Installing cuda-profiler-api- 100% | 38.6 MiB/s | 79.1 KiB | 00m00s [ 96/117] Installing cuda-driver-devel- 100% | 44.6 MiB/s | 137.0 KiB | 00m00s [ 97/117] Installing cuda-profiler-api- 100% | 24.4 MiB/s | 74.9 KiB | 00m00s [ 98/117] Installing cuda-driver-devel- 100% | 64.8 MiB/s | 132.8 KiB | 00m00s [ 99/117] Installing cuda-nvprune-13-0- 100% | 59.3 MiB/s | 182.1 KiB | 00m00s [100/117] Installing cuda-cuxxfilt-13-0 100% | 131.1 MiB/s | 1.0 MiB | 00m00s [101/117] Installing cuda-cuobjdump-13- 100% | 104.8 MiB/s | 751.3 KiB | 00m00s [102/117] Installing cuda-nvprune-12-9- 100% | 88.8 MiB/s | 181.8 KiB | 00m00s [103/117] Installing cuda-cuxxfilt-12-9 100% | 116.1 MiB/s | 1.0 MiB | 00m00s [104/117] Installing cuda-cuobjdump-12- 100% | 81.4 MiB/s | 666.6 KiB | 00m00s [105/117] Installing rhash-0:1.4.5-2.fc 100% | 13.9 MiB/s | 356.4 KiB | 00m00s [106/117] Installing jsoncpp-0:1.9.6-1. 100% | 21.4 MiB/s | 263.1 KiB | 00m00s [107/117] Installing cmake-data-0:3.31. 
100% | 48.5 MiB/s | 9.1 MiB | 00m00s [108/117] Installing cmake-0:3.31.6-2.f 100% | 263.3 MiB/s | 34.2 MiB | 00m00s [109/117] Installing cuda-compiler-12-9 100% | 0.0 B/s | 124.0 B | 00m00s [110/117] Installing cuda-compiler-13-0 100% | 0.0 B/s | 124.0 B | 00m00s [111/117] Installing cuda-libraries-dev 100% | 0.0 B/s | 124.0 B | 00m00s [112/117] Installing cuda-libraries-dev 100% | 60.5 KiB/s | 124.0 B | 00m00s [113/117] Installing gcc14-c++-0:14.2.1 100% | 252.3 MiB/s | 59.8 MiB | 00m00s [114/117] Installing gcc-plugin-annobin 100% | 2.1 MiB/s | 58.6 KiB | 00m00s [115/117] Installing annobin-plugin-gcc 100% | 31.3 MiB/s | 995.1 KiB | 00m00s [116/117] Installing cuda-nvml-devel-13 100% | 177.6 MiB/s | 1.4 MiB | 00m00s [117/117] Installing cuda-nvml-devel-12 100% | 5.1 MiB/s | 1.4 MiB | 00m00s Warning: skipped OpenPGP checks for 85 packages from repositories: https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64, https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64 Complete! Finish: build setup for ollama-ggml-cuda-0.12.3-1.fc42.src.rpm Start: rpmbuild ollama-ggml-cuda-0.12.3-1.fc42.src.rpm Building target platforms: x86_64 Building for target x86_64 setting SOURCE_DATE_EPOCH=1759363200 Executing(%mkbuilddir): /bin/sh -e /var/tmp/rpm-tmp.M94KKc Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.9bqWEW + umask 022 + cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build + cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build + rm -rf ollama-0.12.3 + /usr/lib/rpm/rpmuncompress -x /builddir/build/SOURCES/v0.12.3.tar.gz + STATUS=0 + '[' 0 -ne 0 ']' + cd ollama-0.12.3 + /usr/bin/chmod -Rf a+rX,u+w,g-w,o-w . 
+ /usr/lib/rpm/rpmuncompress /builddir/build/SOURCES/remove-runtime-for-cuda-and-rocm.patch
+ /usr/bin/patch -p1 -s --fuzz=0 --no-backup-if-mismatch -f
+ /usr/lib/rpm/rpmuncompress /builddir/build/SOURCES/replace-library-paths.patch
+ /usr/bin/patch -p1 -s --fuzz=0 --no-backup-if-mismatch -f
+ cp -a /usr/local/cuda-12/ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/
+ patch -p1 -d /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/targets/x86_64-linux/
patching file include/crt/math_functions.h
Hunk #1 succeeded at 2553 with fuzz 1.
Hunk #2 succeeded at 2576 with fuzz 1.
Hunk #3 succeeded at 2598 with fuzz 1.
patch unexpectedly ends in middle of line
Hunk #4 succeeded at 2620 with fuzz 1.
+ RPM_EC=0
++ jobs -p
+ exit 0
Executing(%build): /bin/sh -e /var/tmp/rpm-tmp.ey80Lu
+ umask 022
+ cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ CFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer '
+ export CFLAGS
+ CXXFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer '
+ export CXXFLAGS
+ FFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules '
+ export FFLAGS
+ FCFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules '
+ export FCFLAGS
+ VALAFLAGS=-g
+ export VALAFLAGS
+ RUSTFLAGS='-Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Cforce-frame-pointers=yes -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn'
+ export RUSTFLAGS
+ LDFLAGS='-Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes '
+ export LDFLAGS
+ LT_SYS_LIBRARY_PATH=/usr/lib64:
+ export LT_SYS_LIBRARY_PATH
+ CC=gcc
+ export CC
+ CXX=g++
+ export CXX
+ cd ollama-0.12.3
+ CFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer '
+ export CFLAGS
+ CXXFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer '
+ export CXXFLAGS
+ FFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules '
+ export FFLAGS
+ FCFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules '
+ export FCFLAGS
+ VALAFLAGS=-g
+ export VALAFLAGS
+ RUSTFLAGS='-Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Cforce-frame-pointers=yes -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn'
+ export RUSTFLAGS
+ LDFLAGS='-Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes '
+ export LDFLAGS
+ LT_SYS_LIBRARY_PATH=/usr/lib64:
+ export LT_SYS_LIBRARY_PATH
+ CC=gcc
+ export CC
+ CXX=g++
+ export CXX
+ /usr/bin/cmake -S . -B redhat-linux-build_cuda-13 -DCMAKE_C_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_CXX_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_Fortran_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON -DCMAKE_INSTALL_DO_STRIP:BOOL=OFF -DCMAKE_INSTALL_PREFIX:PATH=/usr -DCMAKE_INSTALL_FULL_SBINDIR:PATH=/usr/bin -DCMAKE_INSTALL_SBINDIR:PATH=bin -DINCLUDE_INSTALL_DIR:PATH=/usr/include -DLIB_INSTALL_DIR:PATH=/usr/lib64 -DSYSCONF_INSTALL_DIR:PATH=/etc -DSHARE_INSTALL_PREFIX:PATH=/usr/share -DLIB_SUFFIX=64 -DBUILD_SHARED_LIBS:BOOL=ON --preset 'CUDA 13' -DOLLAMA_RUNNER_DIR=cuda_v13 -DCMAKE_CUDA_COMPILER=/usr/local/cuda-13/bin/nvcc -DCMAKE_CUDA_FLAGS_RELEASE=-DNDEBUG '-DCMAKE_CUDA_FLAGS=-O2 -g -Xcompiler "-fPIC"'
Preset CMake variables:
  CMAKE_BUILD_TYPE="Release"
  CMAKE_CUDA_ARCHITECTURES="75-virtual;80-virtual;86-virtual;87-virtual;89-virtual;90-virtual;90a-virtual;100-virtual;110-virtual;120-virtual;121-virtual"
  CMAKE_MSVC_RUNTIME_LIBRARY="MultiThreaded"
-- The C compiler identification is GNU 15.2.1
-- The CXX compiler identification is GNU 15.2.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Warning: ccache not found - consider installing it for faster compilation or disable this warning with GGML_CCACHE=OFF
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- GGML_SYSTEM_ARCH: x86
-- Including CPU backend
-- x86 detected
-- Adding CPU backend variant ggml-cpu-x64:
-- x86 detected
-- Adding CPU backend variant ggml-cpu-sse42: -msse4.2 GGML_SSE42
-- x86 detected
-- Adding CPU backend variant ggml-cpu-sandybridge: -msse4.2;-mavx GGML_SSE42;GGML_AVX
-- x86 detected
-- Adding CPU backend variant ggml-cpu-haswell: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2 GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2
-- x86 detected
-- Adding CPU backend variant ggml-cpu-skylakex: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2;-mavx512f;-mavx512cd;-mavx512vl;-mavx512dq;-mavx512bw GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2;GGML_AVX512
-- x86 detected
-- Adding CPU backend variant ggml-cpu-icelake: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2;-mavx512f;-mavx512cd;-mavx512vl;-mavx512dq;-mavx512bw;-mavx512vbmi;-mavx512vnni GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2;GGML_AVX512;GGML_AVX512_VBMI;GGML_AVX512_VNNI
-- x86 detected
-- Adding CPU backend variant ggml-cpu-alderlake: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2;-mavxvnni GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2;GGML_AVX_VNNI
-- Found CUDAToolkit: /usr/local/cuda-13/targets/x86_64-linux/include (found version "13.0.88")
-- CUDA Toolkit found
-- Using CUDA architectures: 75-virtual;80-virtual;86-virtual;87-virtual;89-virtual;90-virtual;90a-virtual;100-virtual;110-virtual;120-virtual;121-virtual
-- The CUDA compiler identification is NVIDIA 13.0.88 with host compiler GNU 15.2.1
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/local/cuda-13/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Looking for a HIP compiler
-- Looking for a HIP compiler - NOTFOUND
-- Configuring done (8.7s)
-- Generating done (0.1s)
CMake Warning:
  Manually-specified variables were not used by the project:

    CMAKE_Fortran_FLAGS_RELEASE
    CMAKE_INSTALL_DO_STRIP
    INCLUDE_INSTALL_DIR
    LIB_SUFFIX
    SHARE_INSTALL_PREFIX
    SYSCONF_INSTALL_DIR

-- Build files have been written to: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13
+ /usr/bin/cmake --build redhat-linux-build_cuda-13 -j2 --verbose --target ggml-cuda
Change Dir: '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
Run Build Command(s): /usr/bin/cmake -E env VERBOSE=1 /usr/bin/gmake -f Makefile -j2 ggml-cuda
/usr/bin/cmake -S/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 -B/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13 --check-build-system CMakeFiles/Makefile.cmake 0
/usr/bin/gmake -f CMakeFiles/Makefile2 ggml-cuda
gmake[1]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
/usr/bin/cmake -S/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 -B/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13 --check-build-system CMakeFiles/Makefile.cmake 0
/usr/bin/cmake -E cmake_progress_start /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/CMakeFiles 47
/usr/bin/gmake -f CMakeFiles/Makefile2 ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/all
gmake[2]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
/usr/bin/gmake -f ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/build.make ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/depend
gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13 && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/DependInfo.cmake "--color="
gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
/usr/bin/gmake -f ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/build.make ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/build
gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
[  0%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.cpp.o
[  2%] Building C object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.c.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/gcc -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.c.o -MF CMakeFiles/ggml-base.dir/ggml.c.o.d -o CMakeFiles/ggml-base.dir/ggml.c.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.cpp.o -MF CMakeFiles/ggml-base.dir/ggml.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.cpp
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c:5663:13: warning: ‘ggml_hash_map_free’ defined but not used [-Wunused-function]
 5663 | static void ggml_hash_map_free(struct hash_map * map) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c:5656:26: warning: ‘ggml_new_hash_map’ defined but not used [-Wunused-function]
 5656 | static struct hash_map * ggml_new_hash_map(size_t size) {
      |                          ^~~~~~~~~~~~~~~~~
In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c:5:
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘ggml_hash_find_or_insert’ defined but not used [-Wunused-function]
  282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |               ^~~~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘ggml_hash_contains’ defined but not used [-Wunused-function]
  256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘ggml_get_op_params_f32’ defined but not used [-Wunused-function]
  140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) {
      |              ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘ggml_are_same_layout’ defined but not used [-Wunused-function]
   77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) {
      |             ^~~~~~~~~~~~~~~~~~~~
In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.cpp:1:
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘size_t ggml_hash_find_or_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function]
  282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |               ^~~~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘size_t ggml_hash_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function]
  261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |               ^~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘bool ggml_hash_contains(const ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function]
  256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘size_t ggml_bitset_size(size_t)’ defined but not used [-Wunused-function]
  187 | static size_t ggml_bitset_size(size_t n) {
      |               ^~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘void ggml_set_op_params_f32(ggml_tensor*, uint32_t, float)’ defined but not used [-Wunused-function]
  150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘void ggml_set_op_params_i32(ggml_tensor*, uint32_t, int32_t)’ defined but not used [-Wunused-function]
  145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘float ggml_get_op_params_f32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function]
  140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) {
      |              ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘int32_t ggml_get_op_params_i32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function]
  135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) {
      |                ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘void ggml_set_op_params(ggml_tensor*, const void*, size_t)’ defined but not used [-Wunused-function]
  129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘bool ggml_are_same_layout(const ggml_tensor*, const ggml_tensor*)’ defined but not used [-Wunused-function]
   77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) {
      |             ^~~~~~~~~~~~~~~~~~~~
[  4%] Building C object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-alloc.c.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/gcc -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/.
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-alloc.c.o -MF CMakeFiles/ggml-base.dir/ggml-alloc.c.o.d -o CMakeFiles/ggml-base.dir/ggml-alloc.c.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-alloc.c In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-alloc.c:4: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘ggml_hash_insert’ defined but not used [-Wunused-function] 261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘ggml_hash_contains’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘ggml_bitset_size’ defined but not used [-Wunused-function] 187 | static size_t ggml_bitset_size(size_t n) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘ggml_set_op_params_f32’ defined but not 
used [-Wunused-function] 150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘ggml_set_op_params_i32’ defined but not used [-Wunused-function] 145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘ggml_get_op_params_f32’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘ggml_get_op_params_i32’ defined but not used [-Wunused-function] 135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘ggml_set_op_params’ defined but not used [-Wunused-function] 129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) { | ^~~~~~~~~~~~~~~~~~ [ 4%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-backend.cpp.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include 
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-backend.cpp.o -MF CMakeFiles/ggml-base.dir/ggml-backend.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml-backend.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-backend.cpp In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-backend.cpp:14: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘bool ggml_hash_contains(const ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘size_t ggml_bitset_size(size_t)’ defined but not used [-Wunused-function] 187 | static size_t ggml_bitset_size(size_t n) { | ^~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘void ggml_set_op_params_f32(ggml_tensor*, uint32_t, float)’ defined but not used [-Wunused-function] 150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘void ggml_set_op_params_i32(ggml_tensor*, uint32_t, int32_t)’ defined but not used [-Wunused-function] 145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘float ggml_get_op_params_f32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘int32_t ggml_get_op_params_i32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘void ggml_set_op_params(ggml_tensor*, const void*, size_t)’ defined but not used [-Wunused-function] 129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) { | ^~~~~~~~~~~~~~~~~~ [ 6%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-opt.cpp.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD 
-DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-opt.cpp.o -MF CMakeFiles/ggml-base.dir/ggml-opt.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml-opt.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-opt.cpp [ 6%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-threading.cpp.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS 
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-threading.cpp.o -MF CMakeFiles/ggml-base.dir/ggml-threading.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml-threading.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-threading.cpp [ 6%] Building C object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-quants.c.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/gcc -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include 
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-quants.c.o -MF CMakeFiles/ggml-base.dir/ggml-quants.c.o.d -o CMakeFiles/ggml-base.dir/ggml-quants.c.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-opt.cpp:6: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘size_t ggml_hash_find_or_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘size_t ggml_hash_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘bool ggml_hash_contains(const ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘size_t ggml_bitset_size(size_t)’ defined but not used [-Wunused-function] 187 | static size_t ggml_bitset_size(size_t n) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘void ggml_set_op_params_f32(ggml_tensor*, uint32_t, float)’ defined but not used [-Wunused-function] 150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘void ggml_set_op_params_i32(ggml_tensor*, uint32_t, int32_t)’ defined but not used [-Wunused-function] 145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘float ggml_get_op_params_f32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘int32_t ggml_get_op_params_i32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘void ggml_set_op_params(ggml_tensor*, const void*, size_t)’ defined but not used [-Wunused-function] 129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘bool ggml_are_same_layout(const ggml_tensor*, const ggml_tensor*)’ defined but not used [-Wunused-function] 77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c:4067:12: warning: ‘iq1_find_best_neighbour’ defined but not used [-Wunused-function] 4067 | static int iq1_find_best_neighbour(const uint16_t * GGML_RESTRICT neighbours, const uint64_t * GGML_RESTRICT grid, | ^~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c:579:14: warning: ‘make_qkx1_quants’ defined but not used [-Wunused-function] 579 | static float make_qkx1_quants(int n, int nmax, const float * GGML_RESTRICT x, uint8_t * GGML_RESTRICT L, float * GGML_RESTRICT the_min, | ^~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c:5: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘ggml_hash_find_or_insert’ defined but not used [-Wunused-function] 282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘ggml_hash_insert’ defined but not used 
[-Wunused-function] 261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘ggml_hash_contains’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘ggml_bitset_size’ defined but not used [-Wunused-function] 187 | static size_t ggml_bitset_size(size_t n) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘ggml_set_op_params_f32’ defined but not used [-Wunused-function] 150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘ggml_set_op_params_i32’ defined but not used [-Wunused-function] 145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘ggml_get_op_params_f32’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘ggml_get_op_params_i32’ defined but not used [-Wunused-function] 135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘ggml_set_op_params’ defined but not used [-Wunused-function] 129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘ggml_are_same_layout’ defined but not used [-Wunused-function] 77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) { | ^~~~~~~~~~~~~~~~~~~~ [ 8%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/gguf.cpp.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. 
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/gguf.cpp.o -MF CMakeFiles/ggml-base.dir/gguf.cpp.o.d -o CMakeFiles/ggml-base.dir/gguf.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/gguf.cpp In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/gguf.cpp:3: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘size_t ggml_hash_find_or_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘size_t ggml_hash_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘bool ggml_hash_contains(const ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | 
^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘size_t ggml_bitset_size(size_t)’ defined but not used [-Wunused-function] 187 | static size_t ggml_bitset_size(size_t n) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘void ggml_set_op_params_f32(ggml_tensor*, uint32_t, float)’ defined but not used [-Wunused-function] 150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘void ggml_set_op_params_i32(ggml_tensor*, uint32_t, int32_t)’ defined but not used [-Wunused-function] 145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘float ggml_get_op_params_f32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘int32_t ggml_get_op_params_i32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘void ggml_set_op_params(ggml_tensor*, const void*, size_t)’ defined but not used [-Wunused-function] 129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) { | 
^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘bool ggml_are_same_layout(const ggml_tensor*, const ggml_tensor*)’ defined but not used [-Wunused-function] 77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) { | ^~~~~~~~~~~~~~~~~~~~ [ 8%] Linking CXX shared library ../../../../../lib/ollama/libggml-base.so cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/cmake -E cmake_link_script CMakeFiles/ggml-base.dir/link.txt --verbose=1 /usr/bin/g++ -fPIC -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -Wl,--dependency-file=CMakeFiles/ggml-base.dir/link.d -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -shared -Wl,-soname,libggml-base.so -o ../../../../../lib/ollama/libggml-base.so "CMakeFiles/ggml-base.dir/ggml.c.o" "CMakeFiles/ggml-base.dir/ggml.cpp.o" "CMakeFiles/ggml-base.dir/ggml-alloc.c.o" "CMakeFiles/ggml-base.dir/ggml-backend.cpp.o" "CMakeFiles/ggml-base.dir/ggml-opt.cpp.o" "CMakeFiles/ggml-base.dir/ggml-threading.cpp.o" "CMakeFiles/ggml-base.dir/ggml-quants.c.o" "CMakeFiles/ggml-base.dir/gguf.cpp.o" -lm gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13' [ 8%] Built target ggml-base 
/usr/bin/gmake -f ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/build.make ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/depend gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13' cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13 && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/DependInfo.cmake "--color=" gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13' /usr/bin/gmake -f ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/build.make ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/build gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13' [ 8%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/acc.cu.o [ 8%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/add-id.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS 
--options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/acc.cu.o -MF CMakeFiles/ggml-cuda.dir/acc.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/acc.cu -o CMakeFiles/ggml-cuda.dir/acc.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" 
"--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/add-id.cu.o -MF CMakeFiles/ggml-cuda.dir/add-id.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/add-id.cu -o CMakeFiles/ggml-cuda.dir/add-id.cu.o [ 10%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/arange.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/arange.cu.o -MF CMakeFiles/ggml-cuda.dir/arange.cu.o.d -x cu -c 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/arange.cu -o CMakeFiles/ggml-cuda.dir/arange.cu.o [ 10%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argmax.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argmax.cu.o -MF CMakeFiles/ggml-cuda.dir/argmax.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/argmax.cu -o CMakeFiles/ggml-cuda.dir/argmax.cu.o [ 12%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argsort.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc 
-forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argsort.cu.o -MF CMakeFiles/ggml-cuda.dir/argsort.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/argsort.cu -o CMakeFiles/ggml-cuda.dir/argsort.cu.o [ 12%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/binbcast.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" 
"--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/binbcast.cu.o -MF CMakeFiles/ggml-cuda.dir/binbcast.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/binbcast.cu -o CMakeFiles/ggml-cuda.dir/binbcast.cu.o [ 14%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/clamp.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" 
"--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/clamp.cu.o -MF CMakeFiles/ggml-cuda.dir/clamp.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/clamp.cu -o CMakeFiles/ggml-cuda.dir/clamp.cu.o [ 14%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/concat.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/concat.cu.o -MF CMakeFiles/ggml-cuda.dir/concat.cu.o.d -x cu -c 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/concat.cu -o CMakeFiles/ggml-cuda.dir/concat.cu.o [ 14%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o -MF CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/conv-transpose-1d.cu -o CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o [ 17%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o cd 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o -MF CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/conv2d-dw.cu -o CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o [ 17%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file 
CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o -MF CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/conv2d-transpose.cu -o CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o [ 19%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/convert.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" 
"--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/convert.cu.o -MF CMakeFiles/ggml-cuda.dir/convert.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/convert.cu -o CMakeFiles/ggml-cuda.dir/convert.cu.o [ 19%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/count-equal.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT 
ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/count-equal.cu.o -MF CMakeFiles/ggml-cuda.dir/count-equal.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/count-equal.cu -o CMakeFiles/ggml-cuda.dir/count-equal.cu.o [ 21%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cpy.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cpy.cu.o -MF CMakeFiles/ggml-cuda.dir/cpy.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/cpy.cu -o CMakeFiles/ggml-cuda.dir/cpy.cu.o [ 21%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o cd 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o -MF CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/cross-entropy-loss.cu -o CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o [ 23%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/diagmask.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG 
-Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/diagmask.cu.o -MF CMakeFiles/ggml-cuda.dir/diagmask.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/diagmask.cu -o CMakeFiles/ggml-cuda.dir/diagmask.cu.o [ 23%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" 
"--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn-tile-f16.cu -o CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o [ 23%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default 
-Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn-tile-f32.cu -o CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o [ 25%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn-wmma-f16.cu -o CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o [ 25%] Building CUDA object 
ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn.cu -o CMakeFiles/ggml-cuda.dir/fattn.cu.o [ 27%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/getrows.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 
-DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/getrows.cu.o -MF CMakeFiles/ggml-cuda.dir/getrows.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/getrows.cu -o CMakeFiles/ggml-cuda.dir/getrows.cu.o [ 27%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" 
"--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o -MF CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/ggml-cuda.cu -o CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o [ 29%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/gla.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT 
ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/gla.cu.o -MF CMakeFiles/ggml-cuda.dir/gla.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/gla.cu -o CMakeFiles/ggml-cuda.dir/gla.cu.o [ 29%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/im2col.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/im2col.cu.o -MF CMakeFiles/ggml-cuda.dir/im2col.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/im2col.cu -o CMakeFiles/ggml-cuda.dir/im2col.cu.o [ 29%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mean.cu.o cd 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mean.cu.o -MF CMakeFiles/ggml-cuda.dir/mean.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mean.cu -o CMakeFiles/ggml-cuda.dir/mean.cu.o [ 31%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmf.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file 
CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmf.cu.o -MF CMakeFiles/ggml-cuda.dir/mmf.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmf.cu -o CMakeFiles/ggml-cuda.dir/mmf.cu.o [ 31%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmq.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" 
"--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmq.cu.o -MF CMakeFiles/ggml-cuda.dir/mmq.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmq.cu -o CMakeFiles/ggml-cuda.dir/mmq.cu.o [ 34%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvf.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvf.cu.o -MF 
CMakeFiles/ggml-cuda.dir/mmvf.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmvf.cu -o CMakeFiles/ggml-cuda.dir/mmvf.cu.o [ 34%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvq.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvq.cu.o -MF CMakeFiles/ggml-cuda.dir/mmvq.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmvq.cu -o CMakeFiles/ggml-cuda.dir/mmvq.cu.o [ 36%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/norm.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && 
/usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/norm.cu.o -MF CMakeFiles/ggml-cuda.dir/norm.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/norm.cu -o CMakeFiles/ggml-cuda.dir/norm.cu.o [ 36%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" 
"--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o -MF CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/opt-step-adamw.cu -o CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o [ 38%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/out-prod.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" 
"--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/out-prod.cu.o -MF CMakeFiles/ggml-cuda.dir/out-prod.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/out-prod.cu -o CMakeFiles/ggml-cuda.dir/out-prod.cu.o [ 38%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pad.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pad.cu.o -MF CMakeFiles/ggml-cuda.dir/pad.cu.o.d -x cu -c 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/pad.cu -o CMakeFiles/ggml-cuda.dir/pad.cu.o [ 38%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pool2d.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pool2d.cu.o -MF CMakeFiles/ggml-cuda.dir/pool2d.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/pool2d.cu -o CMakeFiles/ggml-cuda.dir/pool2d.cu.o [ 40%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/quantize.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc 
-forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/quantize.cu.o -MF CMakeFiles/ggml-cuda.dir/quantize.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/quantize.cu -o CMakeFiles/ggml-cuda.dir/quantize.cu.o [ 40%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/roll.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" 
"--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/roll.cu.o -MF CMakeFiles/ggml-cuda.dir/roll.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/roll.cu -o CMakeFiles/ggml-cuda.dir/roll.cu.o [ 42%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/rope.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" 
"--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/rope.cu.o -MF CMakeFiles/ggml-cuda.dir/rope.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/rope.cu -o CMakeFiles/ggml-cuda.dir/rope.cu.o [ 42%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/scale.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/scale.cu.o -MF CMakeFiles/ggml-cuda.dir/scale.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/scale.cu 
-o CMakeFiles/ggml-cuda.dir/scale.cu.o [ 44%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/set-rows.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/set-rows.cu.o -MF CMakeFiles/ggml-cuda.dir/set-rows.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/set-rows.cu -o CMakeFiles/ggml-cuda.dir/set-rows.cu.o [ 44%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softcap.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 
-DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softcap.cu.o -MF CMakeFiles/ggml-cuda.dir/softcap.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/softcap.cu -o CMakeFiles/ggml-cuda.dir/softcap.cu.o [ 46%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softmax.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" 
"--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softmax.cu.o -MF CMakeFiles/ggml-cuda.dir/softmax.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/softmax.cu -o CMakeFiles/ggml-cuda.dir/softmax.cu.o [ 46%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda 
-compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o -MF CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/ssm-conv.cu -o CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o [ 46%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o -MF CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/ssm-scan.cu -o CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o [ 48%] Building CUDA object 
ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sum.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sum.cu.o -MF CMakeFiles/ggml-cuda.dir/sum.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/sum.cu -o CMakeFiles/ggml-cuda.dir/sum.cu.o [ 48%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sumrows.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG 
-Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sumrows.cu.o -MF CMakeFiles/ggml-cuda.dir/sumrows.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/sumrows.cu -o CMakeFiles/ggml-cuda.dir/sumrows.cu.o [ 51%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/tsembd.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" 
"--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/tsembd.cu.o -MF CMakeFiles/ggml-cuda.dir/tsembd.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/tsembd.cu -o CMakeFiles/ggml-cuda.dir/tsembd.cu.o [ 51%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/unary.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT 
ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/unary.cu.o -MF CMakeFiles/ggml-cuda.dir/unary.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/unary.cu -o CMakeFiles/ggml-cuda.dir/unary.cu.o [ 53%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/upscale.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/upscale.cu.o -MF CMakeFiles/ggml-cuda.dir/upscale.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/upscale.cu -o CMakeFiles/ggml-cuda.dir/upscale.cu.o [ 53%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/wkv.cu.o cd 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/wkv.cu.o -MF CMakeFiles/ggml-cuda.dir/wkv.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/wkv.cu -o CMakeFiles/ggml-cuda.dir/wkv.cu.o [ 53%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS 
--options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o [ 55%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" 
"--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o [ 55%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" 
"--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o [ 57%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" 
"--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o [ 57%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" 
"--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o [ 59%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT 
ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o [ 59%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o -MF 
CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o [ 61%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o.d -x cu -c 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o [ 61%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu -o 
CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o [ 61%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o [ 63%] Building CUDA object 
ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o [ 63%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o cd 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o [ 65%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc 
-forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o [ 65%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS 
-DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o [ 68%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 
"--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o [ 68%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" 
"--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o [ 68%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" 
"--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o [ 70%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" 
"--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o [ 70%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda 
-compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o [ 72%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o -MF 
CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq1_s.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o [ 72%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_s.cu -o 
CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o [ 74%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_xs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o [ 74%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o cd 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_xxs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o [ 76%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 
-DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq3_s.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o [ 76%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" 
"--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq3_xxs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o [ 76%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" 
"--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq4_nl.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o [ 78%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" 
-Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq4_xs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o [ 78%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o -MF 
CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-mxfp4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o [ 80%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q2_k.cu -o 
CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o [ 80%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q3_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o [ 82%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o cd 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q4_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o [ 82%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 
-DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q4_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o [ 82%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" 
"--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q4_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o [ 85%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" 
"--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q5_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o [ 85%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC 
-use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q5_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o [ 87%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o.d -x cu -c 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q5_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o [ 87%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q6_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o [ 89%] Building CUDA object 
ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q8_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o [ 89%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc 
-forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o [ 91%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS 
-DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o [ 91%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 
"--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o [ 91%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" 
"--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o [ 93%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" 
"--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o [ 93%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" 
"--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o [ 95%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler 
-Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o [ 95%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o -MF 
CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o [ 97%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o.d -x cu -c 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o [ 97%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/local/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu -o 
CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o [100%] Linking CUDA shared module ../../../../../../lib/ollama/libggml-cuda.so cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/bin/cmake -E cmake_link_script CMakeFiles/ggml-cuda.dir/link.txt --verbose=1 /usr/bin/g++ -fPIC -Wl,--dependency-file=CMakeFiles/ggml-cuda.dir/link.d -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -shared -o ../../../../../../lib/ollama/libggml-cuda.so @CMakeFiles/ggml-cuda.dir/objects1.rsp @CMakeFiles/ggml-cuda.dir/linkLibs.rsp -L"/usr/local/cuda-13/targets/x86_64-linux/lib/stubs" -L"/usr/local/cuda-13/targets/x86_64-linux/lib" gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13' [100%] Built target ggml-cuda gmake[2]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13' /usr/bin/cmake -E cmake_progress_start /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/CMakeFiles 0 gmake[1]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13' + CFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer ' + export CFLAGS + CXXFLAGS='-O2 -flto=auto 
-ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer ' + export CXXFLAGS + FFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' + export FFLAGS + FCFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' + export FCFLAGS + VALAFLAGS=-g + export VALAFLAGS + RUSTFLAGS='-Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Cforce-frame-pointers=yes -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn' + export RUSTFLAGS + LDFLAGS='-Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes ' + export LDFLAGS + 
LT_SYS_LIBRARY_PATH=/usr/lib64: + export LT_SYS_LIBRARY_PATH + CC=gcc + export CC + CXX=g++ + export CXX + /usr/bin/cmake -S . -B redhat-linux-build_cuda-12 -DCMAKE_C_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_CXX_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_Fortran_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON -DCMAKE_INSTALL_DO_STRIP:BOOL=OFF -DCMAKE_INSTALL_PREFIX:PATH=/usr -DCMAKE_INSTALL_FULL_SBINDIR:PATH=/usr/bin -DCMAKE_INSTALL_SBINDIR:PATH=bin -DINCLUDE_INSTALL_DIR:PATH=/usr/include -DLIB_INSTALL_DIR:PATH=/usr/lib64 -DSYSCONF_INSTALL_DIR:PATH=/etc -DSHARE_INSTALL_PREFIX:PATH=/usr/share -DLIB_SUFFIX=64 -DBUILD_SHARED_LIBS:BOOL=ON --preset 'CUDA 12' -DOLLAMA_RUNNER_DIR=cuda_v12 -DCMAKE_CUDA_COMPILER=/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -DCMAKE_CUDA_HOST_COMPILER=g++-14 -DCMAKE_CUDA_FLAGS_RELEASE=-DNDEBUG '-DCMAKE_CUDA_FLAGS=-O2 -g -Xcompiler "-fPIC"' Preset CMake variables: CMAKE_BUILD_TYPE="Release" CMAKE_CUDA_ARCHITECTURES="50;60;61;70;75;80;86;87;89;90;90a;120" CMAKE_MSVC_RUNTIME_LIBRARY="MultiThreaded" -- The C compiler identification is GNU 15.2.1 -- The CXX compiler identification is GNU 15.2.1 -- Detecting C compiler ABI info -- Detecting C compiler ABI info - done -- Check for working C compiler: /usr/bin/gcc - skipped -- Detecting C compile features -- Detecting C compile features - done -- Detecting CXX compiler ABI info -- Detecting CXX compiler ABI info - done -- Check for working CXX compiler: /usr/bin/g++ - skipped -- Detecting CXX compile features -- Detecting CXX compile features - done -- Performing Test CMAKE_HAVE_LIBC_PTHREAD -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success -- Found Threads: TRUE -- Warning: ccache not found - consider installing it for faster compilation or disable this warning with GGML_CCACHE=OFF -- CMAKE_SYSTEM_PROCESSOR: x86_64 -- GGML_SYSTEM_ARCH: x86 -- Including CPU backend -- x86 detected -- Adding CPU backend variant ggml-cpu-x64: -- x86 detected -- Adding CPU backend 
variant ggml-cpu-sse42: -msse4.2 GGML_SSE42 -- x86 detected -- Adding CPU backend variant ggml-cpu-sandybridge: -msse4.2;-mavx GGML_SSE42;GGML_AVX -- x86 detected -- Adding CPU backend variant ggml-cpu-haswell: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2 GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2 -- x86 detected -- Adding CPU backend variant ggml-cpu-skylakex: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2;-mavx512f;-mavx512cd;-mavx512vl;-mavx512dq;-mavx512bw GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2;GGML_AVX512 -- x86 detected -- Adding CPU backend variant ggml-cpu-icelake: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2;-mavx512f;-mavx512cd;-mavx512vl;-mavx512dq;-mavx512bw;-mavx512vbmi;-mavx512vnni GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2;GGML_AVX512;GGML_AVX512_VBMI;GGML_AVX512_VNNI -- x86 detected -- Adding CPU backend variant ggml-cpu-alderlake: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2;-mavxvnni GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2;GGML_AVX_VNNI -- Found CUDAToolkit: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/targets/x86_64-linux/include (found version "12.9.86") -- CUDA Toolkit found -- Using CUDA architectures: 50;60;61;70;75;80;86;87;89;90;90a;120 -- The CUDA compiler identification is NVIDIA 12.9.86 with host compiler GNU 14.2.1 -- Detecting CUDA compiler ABI info -- Detecting CUDA compiler ABI info - done -- Check for working CUDA compiler: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc - skipped -- Detecting CUDA compile features -- Detecting CUDA compile features - done -- Looking for a HIP compiler -- Looking for a HIP compiler - NOTFOUND -- Configuring done (8.3s) -- Generating done (0.1s) CMake Warning: Manually-specified variables were not used by the project: CMAKE_Fortran_FLAGS_RELEASE CMAKE_INSTALL_DO_STRIP INCLUDE_INSTALL_DIR LIB_SUFFIX SHARE_INSTALL_PREFIX SYSCONF_INSTALL_DIR -- Build files have been written to: 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12 + /usr/bin/cmake --build redhat-linux-build_cuda-12 -j2 --verbose --target ggml-cuda Change Dir: '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12' Run Build Command(s): /usr/bin/cmake -E env VERBOSE=1 /usr/bin/gmake -f Makefile -j2 ggml-cuda /usr/bin/cmake -S/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 -B/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12 --check-build-system CMakeFiles/Makefile.cmake 0 /usr/bin/gmake -f CMakeFiles/Makefile2 ggml-cuda gmake[1]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12' /usr/bin/cmake -S/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 -B/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12 --check-build-system CMakeFiles/Makefile.cmake 0 /usr/bin/cmake -E cmake_progress_start /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/CMakeFiles 47 /usr/bin/gmake -f CMakeFiles/Makefile2 ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/all gmake[2]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12' /usr/bin/gmake -f ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/build.make ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/depend gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12' cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12 && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/DependInfo.cmake "--color=" gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12' /usr/bin/gmake -f ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/build.make ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/build gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12' [ 2%] Building C object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.c.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/gcc -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. 
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.c.o -MF CMakeFiles/ggml-base.dir/ggml.c.o.d -o CMakeFiles/ggml-base.dir/ggml.c.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c [ 2%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.cpp.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. 
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.cpp.o -MF CMakeFiles/ggml-base.dir/ggml.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.cpp /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c:5663:13: warning: ‘ggml_hash_map_free’ defined but not used [-Wunused-function] 5663 | static void ggml_hash_map_free(struct hash_map * map) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c:5656:26: warning: ‘ggml_new_hash_map’ defined but not used [-Wunused-function] 5656 | static struct hash_map * ggml_new_hash_map(size_t size) { | ^~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c:5: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘ggml_hash_find_or_insert’ defined but not used [-Wunused-function] 282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘ggml_hash_contains’ defined but not used 
[-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘ggml_get_op_params_f32’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘ggml_are_same_layout’ defined but not used [-Wunused-function] 77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) { | ^~~~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.cpp:1: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘size_t ggml_hash_find_or_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘size_t ggml_hash_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘bool ggml_hash_contains(const ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘size_t ggml_bitset_size(size_t)’ defined but not used [-Wunused-function] 187 | static size_t ggml_bitset_size(size_t n) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘void ggml_set_op_params_f32(ggml_tensor*, uint32_t, float)’ defined but not used [-Wunused-function] 150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘void ggml_set_op_params_i32(ggml_tensor*, uint32_t, int32_t)’ defined but not used [-Wunused-function] 145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘float ggml_get_op_params_f32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘int32_t ggml_get_op_params_i32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘void ggml_set_op_params(ggml_tensor*, const void*, size_t)’ defined but not used [-Wunused-function] 129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) { | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘bool ggml_are_same_layout(const ggml_tensor*, const ggml_tensor*)’ defined but not used [-Wunused-function] 77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) { | ^~~~~~~~~~~~~~~~~~~~ [ 4%] Building C object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-alloc.c.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/gcc -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. 
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-alloc.c.o -MF CMakeFiles/ggml-base.dir/ggml-alloc.c.o.d -o CMakeFiles/ggml-base.dir/ggml-alloc.c.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-alloc.c In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-alloc.c:4: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘ggml_hash_insert’ defined but not used [-Wunused-function] 261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘ggml_hash_contains’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘ggml_bitset_size’ defined but not used [-Wunused-function] 187 | static size_t ggml_bitset_size(size_t n) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘ggml_set_op_params_f32’ defined but not 
used [-Wunused-function] 150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘ggml_set_op_params_i32’ defined but not used [-Wunused-function] 145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘ggml_get_op_params_f32’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘ggml_get_op_params_i32’ defined but not used [-Wunused-function] 135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘ggml_set_op_params’ defined but not used [-Wunused-function] 129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) { | ^~~~~~~~~~~~~~~~~~ [ 4%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-backend.cpp.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include 
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-backend.cpp.o -MF CMakeFiles/ggml-base.dir/ggml-backend.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml-backend.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-backend.cpp In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-backend.cpp:14: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘bool ggml_hash_contains(const ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘size_t ggml_bitset_size(size_t)’ defined but not used [-Wunused-function] 187 | static size_t ggml_bitset_size(size_t n) { | ^~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘void ggml_set_op_params_f32(ggml_tensor*, uint32_t, float)’ defined but not used [-Wunused-function] 150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘void ggml_set_op_params_i32(ggml_tensor*, uint32_t, int32_t)’ defined but not used [-Wunused-function] 145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘float ggml_get_op_params_f32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘int32_t ggml_get_op_params_i32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘void ggml_set_op_params(ggml_tensor*, const void*, size_t)’ defined but not used [-Wunused-function] 129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) { | ^~~~~~~~~~~~~~~~~~ [ 6%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-opt.cpp.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD 
-DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-opt.cpp.o -MF CMakeFiles/ggml-base.dir/ggml-opt.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml-opt.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-opt.cpp [ 6%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-threading.cpp.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS 
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-threading.cpp.o -MF CMakeFiles/ggml-base.dir/ggml-threading.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml-threading.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-threading.cpp [ 6%] Building C object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-quants.c.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/gcc -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include 
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-quants.c.o -MF CMakeFiles/ggml-base.dir/ggml-quants.c.o.d -o CMakeFiles/ggml-base.dir/ggml-quants.c.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-opt.cpp:6: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘size_t ggml_hash_find_or_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘size_t ggml_hash_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘bool ggml_hash_contains(const ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function]
  256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘size_t ggml_bitset_size(size_t)’ defined but not used [-Wunused-function]
  187 | static size_t ggml_bitset_size(size_t n) {
      |               ^~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘void ggml_set_op_params_f32(ggml_tensor*, uint32_t, float)’ defined but not used [-Wunused-function]
  150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘void ggml_set_op_params_i32(ggml_tensor*, uint32_t, int32_t)’ defined but not used [-Wunused-function]
  145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘float ggml_get_op_params_f32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function]
  140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) {
      |              ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘int32_t ggml_get_op_params_i32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function]
  135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) {
      |                ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘void ggml_set_op_params(ggml_tensor*, const void*, size_t)’ defined but not used [-Wunused-function]
  129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘bool ggml_are_same_layout(const ggml_tensor*, const ggml_tensor*)’ defined but not used [-Wunused-function]
   77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) {
      |             ^~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c:4067:12: warning: ‘iq1_find_best_neighbour’ defined but not used [-Wunused-function]
 4067 | static int iq1_find_best_neighbour(const uint16_t * GGML_RESTRICT neighbours, const uint64_t * GGML_RESTRICT grid,
      |            ^~~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c:579:14: warning: ‘make_qkx1_quants’ defined but not used [-Wunused-function]
  579 | static float make_qkx1_quants(int n, int nmax, const float * GGML_RESTRICT x, uint8_t * GGML_RESTRICT L, float * GGML_RESTRICT the_min,
      |              ^~~~~~~~~~~~~~~~
In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c:5:
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘ggml_hash_find_or_insert’ defined but not used [-Wunused-function]
  282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |               ^~~~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘ggml_hash_insert’ defined but not used [-Wunused-function]
  261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |               ^~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘ggml_hash_contains’ defined but not used [-Wunused-function]
  256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘ggml_bitset_size’ defined but not used [-Wunused-function]
  187 | static size_t ggml_bitset_size(size_t n) {
      |               ^~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘ggml_set_op_params_f32’ defined but not used [-Wunused-function]
  150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘ggml_set_op_params_i32’ defined but not used [-Wunused-function]
  145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘ggml_get_op_params_f32’ defined but not used [-Wunused-function]
  140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) {
      |              ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘ggml_get_op_params_i32’ defined but not used [-Wunused-function]
  135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) {
      |                ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘ggml_set_op_params’ defined but not used [-Wunused-function]
  129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘ggml_are_same_layout’ defined but not used [-Wunused-function]
   77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) {
      |             ^~~~~~~~~~~~~~~~~~~~
[  8%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/gguf.cpp.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/gguf.cpp.o -MF CMakeFiles/ggml-base.dir/gguf.cpp.o.d -o CMakeFiles/ggml-base.dir/gguf.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/gguf.cpp
In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/gguf.cpp:3:
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘size_t ggml_hash_find_or_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function]
  282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |               ^~~~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘size_t ggml_hash_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function]
  261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |               ^~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘bool ggml_hash_contains(const ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function]
  256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘size_t ggml_bitset_size(size_t)’ defined but not used [-Wunused-function]
  187 | static size_t ggml_bitset_size(size_t n) {
      |               ^~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘void ggml_set_op_params_f32(ggml_tensor*, uint32_t, float)’ defined but not used [-Wunused-function]
  150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘void ggml_set_op_params_i32(ggml_tensor*, uint32_t, int32_t)’ defined but not used [-Wunused-function]
  145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘float ggml_get_op_params_f32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function]
  140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) {
      |              ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘int32_t ggml_get_op_params_i32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function]
  135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) {
      |                ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘void ggml_set_op_params(ggml_tensor*, const void*, size_t)’ defined but not used [-Wunused-function]
  129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘bool ggml_are_same_layout(const ggml_tensor*, const ggml_tensor*)’ defined but not used [-Wunused-function]
   77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) {
      |             ^~~~~~~~~~~~~~~~~~~~
[  8%] Linking CXX shared library ../../../../../lib/ollama/libggml-base.so
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/cmake -E cmake_link_script CMakeFiles/ggml-base.dir/link.txt --verbose=1
/usr/bin/g++ -fPIC -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -Wl,--dependency-file=CMakeFiles/ggml-base.dir/link.d -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -shared -Wl,-soname,libggml-base.so -o ../../../../../lib/ollama/libggml-base.so "CMakeFiles/ggml-base.dir/ggml.c.o" "CMakeFiles/ggml-base.dir/ggml.cpp.o" "CMakeFiles/ggml-base.dir/ggml-alloc.c.o" "CMakeFiles/ggml-base.dir/ggml-backend.cpp.o" "CMakeFiles/ggml-base.dir/ggml-opt.cpp.o" "CMakeFiles/ggml-base.dir/ggml-threading.cpp.o" "CMakeFiles/ggml-base.dir/ggml-quants.c.o" "CMakeFiles/ggml-base.dir/gguf.cpp.o" -lm
gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12'
[  8%] Built target ggml-base
/usr/bin/gmake -f ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/build.make ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/depend
gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12'
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12 && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/DependInfo.cmake "--color="
gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12'
/usr/bin/gmake -f ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/build.make ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/build
gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12'
[  8%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/acc.cu.o
[  8%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/add-id.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/acc.cu.o -MF CMakeFiles/ggml-cuda.dir/acc.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/acc.cu -o CMakeFiles/ggml-cuda.dir/acc.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/add-id.cu.o -MF CMakeFiles/ggml-cuda.dir/add-id.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/add-id.cu -o CMakeFiles/ggml-cuda.dir/add-id.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 10%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/arange.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/arange.cu.o -MF CMakeFiles/ggml-cuda.dir/arange.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/arange.cu -o CMakeFiles/ggml-cuda.dir/arange.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 10%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argmax.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argmax.cu.o -MF CMakeFiles/ggml-cuda.dir/argmax.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/argmax.cu -o CMakeFiles/ggml-cuda.dir/argmax.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 12%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argsort.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argsort.cu.o -MF CMakeFiles/ggml-cuda.dir/argsort.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/argsort.cu -o CMakeFiles/ggml-cuda.dir/argsort.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 12%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/binbcast.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/binbcast.cu.o -MF CMakeFiles/ggml-cuda.dir/binbcast.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/binbcast.cu -o CMakeFiles/ggml-cuda.dir/binbcast.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 14%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/clamp.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/clamp.cu.o -MF CMakeFiles/ggml-cuda.dir/clamp.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/clamp.cu -o CMakeFiles/ggml-cuda.dir/clamp.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 14%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/concat.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/concat.cu.o -MF CMakeFiles/ggml-cuda.dir/concat.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/concat.cu -o CMakeFiles/ggml-cuda.dir/concat.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 14%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o -MF CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/conv-transpose-1d.cu -o CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 17%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o -MF CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/conv2d-dw.cu -o CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 17%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o -MF CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/conv2d-transpose.cu -o CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 19%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/convert.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/convert.cu.o -MF CMakeFiles/ggml-cuda.dir/convert.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/convert.cu -o CMakeFiles/ggml-cuda.dir/convert.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 19%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/count-equal.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/count-equal.cu.o -MF CMakeFiles/ggml-cuda.dir/count-equal.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/count-equal.cu -o CMakeFiles/ggml-cuda.dir/count-equal.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 21%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cpy.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cpy.cu.o -MF CMakeFiles/ggml-cuda.dir/cpy.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/cpy.cu -o CMakeFiles/ggml-cuda.dir/cpy.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 21%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o -MF CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/cross-entropy-loss.cu -o CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 23%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/diagmask.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/diagmask.cu.o -MF CMakeFiles/ggml-cuda.dir/diagmask.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/diagmask.cu -o CMakeFiles/ggml-cuda.dir/diagmask.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 23%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn-tile-f16.cu -o CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 23%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn-tile-f32.cu -o CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 25%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn-wmma-f16.cu -o CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 25%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn.cu -o CMakeFiles/ggml-cuda.dir/fattn.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 27%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/getrows.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/getrows.cu.o -MF CMakeFiles/ggml-cuda.dir/getrows.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/getrows.cu -o CMakeFiles/ggml-cuda.dir/getrows.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 27%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o -MF CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/ggml-cuda.cu -o CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 29%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/gla.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/gla.cu.o -MF CMakeFiles/ggml-cuda.dir/gla.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/gla.cu -o CMakeFiles/ggml-cuda.dir/gla.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 29%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/im2col.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/im2col.cu.o -MF CMakeFiles/ggml-cuda.dir/im2col.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/im2col.cu -o CMakeFiles/ggml-cuda.dir/im2col.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 29%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mean.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mean.cu.o -MF CMakeFiles/ggml-cuda.dir/mean.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mean.cu -o CMakeFiles/ggml-cuda.dir/mean.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 31%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmf.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmf.cu.o -MF CMakeFiles/ggml-cuda.dir/mmf.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmf.cu -o CMakeFiles/ggml-cuda.dir/mmf.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 31%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmq.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmq.cu.o -MF CMakeFiles/ggml-cuda.dir/mmq.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmq.cu -o CMakeFiles/ggml-cuda.dir/mmq.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 34%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvf.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvf.cu.o -MF CMakeFiles/ggml-cuda.dir/mmvf.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmvf.cu -o CMakeFiles/ggml-cuda.dir/mmvf.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 34%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvq.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvq.cu.o -MF CMakeFiles/ggml-cuda.dir/mmvq.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmvq.cu -o CMakeFiles/ggml-cuda.dir/mmvq.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 36%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/norm.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/norm.cu.o -MF CMakeFiles/ggml-cuda.dir/norm.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/norm.cu -o CMakeFiles/ggml-cuda.dir/norm.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 36%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o -MF CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/opt-step-adamw.cu -o CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 38%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/out-prod.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/out-prod.cu.o -MF CMakeFiles/ggml-cuda.dir/out-prod.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/out-prod.cu -o CMakeFiles/ggml-cuda.dir/out-prod.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 38%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pad.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pad.cu.o -MF CMakeFiles/ggml-cuda.dir/pad.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/pad.cu -o CMakeFiles/ggml-cuda.dir/pad.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 38%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pool2d.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pool2d.cu.o -MF CMakeFiles/ggml-cuda.dir/pool2d.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/pool2d.cu -o CMakeFiles/ggml-cuda.dir/pool2d.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 40%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/quantize.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/quantize.cu.o -MF CMakeFiles/ggml-cuda.dir/quantize.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/quantize.cu -o CMakeFiles/ggml-cuda.dir/quantize.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 40%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/roll.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/roll.cu.o -MF CMakeFiles/ggml-cuda.dir/roll.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/roll.cu -o CMakeFiles/ggml-cuda.dir/roll.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 42%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/rope.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/rope.cu.o -MF CMakeFiles/ggml-cuda.dir/rope.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/rope.cu -o CMakeFiles/ggml-cuda.dir/rope.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 42%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/scale.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/scale.cu.o -MF CMakeFiles/ggml-cuda.dir/scale.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/scale.cu -o CMakeFiles/ggml-cuda.dir/scale.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 44%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/set-rows.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/set-rows.cu.o -MF CMakeFiles/ggml-cuda.dir/set-rows.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/set-rows.cu -o CMakeFiles/ggml-cuda.dir/set-rows.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 44%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softcap.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softcap.cu.o -MF CMakeFiles/ggml-cuda.dir/softcap.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/softcap.cu -o CMakeFiles/ggml-cuda.dir/softcap.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 46%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softmax.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softmax.cu.o -MF CMakeFiles/ggml-cuda.dir/softmax.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/softmax.cu -o CMakeFiles/ggml-cuda.dir/softmax.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 46%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o -MF CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/ssm-conv.cu -o CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 46%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o -MF CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/ssm-scan.cu -o CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 48%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sum.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sum.cu.o -MF CMakeFiles/ggml-cuda.dir/sum.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/sum.cu -o CMakeFiles/ggml-cuda.dir/sum.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 48%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sumrows.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sumrows.cu.o -MF CMakeFiles/ggml-cuda.dir/sumrows.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/sumrows.cu -o CMakeFiles/ggml-cuda.dir/sumrows.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 51%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/tsembd.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/tsembd.cu.o -MF CMakeFiles/ggml-cuda.dir/tsembd.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/tsembd.cu -o CMakeFiles/ggml-cuda.dir/tsembd.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 51%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/unary.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/unary.cu.o -MF CMakeFiles/ggml-cuda.dir/unary.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/unary.cu -o CMakeFiles/ggml-cuda.dir/unary.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 53%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/upscale.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/upscale.cu.o -MF CMakeFiles/ggml-cuda.dir/upscale.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/upscale.cu -o CMakeFiles/ggml-cuda.dir/upscale.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 53%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/wkv.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/wkv.cu.o -MF CMakeFiles/ggml-cuda.dir/wkv.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/wkv.cu -o CMakeFiles/ggml-cuda.dir/wkv.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 53%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 55%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 55%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 57%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 57%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 59%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 59%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 61%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 61%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 61%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
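Every nvcc invocation in this log emits the same deprecation warning for the pre-Turing targets (compute_50 through compute_70), exactly as the warning text suggests fixing. A minimal sketch of how a rebuild could silence it without dropping those architectures, assuming extra CUDA flags are passed through the standard CMake variable `CMAKE_CUDA_FLAGS` (the build-directory name below is taken from this log; the rest is an illustrative assumption, not part of the recorded build):

```shell
# Forward the flag named in the warning to every nvcc invocation.
# -Wno-deprecated-gpu-targets suppresses the "architectures prior to '_75'"
# deprecation notice while still generating code for those targets.
cmake -B redhat-linux-build_cuda-12 \
      -DCMAKE_CUDA_FLAGS="-Wno-deprecated-gpu-targets"
```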
[ 63%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 63%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 65%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 65%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 68%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]"
"--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 68%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu -o 
CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 68%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o.d -x 
cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 70%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT 
ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 70%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" 
"--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 72%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" 
"--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq1_s.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 72%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" 
"--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_s.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 74%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_xs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a 
future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 74%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_xxs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o nvcc warning : Support for 
offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 76%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq3_s.cu -o 
CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 76%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o.d -x cu -c 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq3_xxs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 76%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o -MF 
CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq4_nl.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 78%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT 
ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq4_xs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 78%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math 
-extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-mxfp4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 80%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" 
"--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q2_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 80%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" 
"--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q3_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 82%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" 
"--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q4_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 82%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" 
"--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q4_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 82%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" 
"--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q4_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 85%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" 
"--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q5_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 85%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q5_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future 
release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 87%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q5_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o nvcc warning : Support for offline compilation 
for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 87%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q6_k.cu -o 
CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 89%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o.d -x cu -c 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q8_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 89%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT 
ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 91%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" 
"--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 91%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" 
"--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [ 91%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" 
"--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
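Every nvcc invocation in this log emits the same deprecation warning, because the generated-code list includes pre-Turing targets (compute_50, compute_60, compute_61, compute_70, all older than sm_75). The warning text itself names the suppression flag. A minimal sketch of threading that flag into a rebuild follows; the variable name and the CMake option shown are illustrative assumptions, since the actual spec file and %cmake options are not visible in this log:

```shell
# The repeated nvcc warning suggests -Wno-deprecated-gpu-targets to silence it.
# Assumption: the project is reconfigured via CMake; exact options in the real
# .spec file are not shown in this log, so treat this as a sketch only.
EXTRA_CUDAFLAGS="-Wno-deprecated-gpu-targets"

# One plausible way to pass it through (hypothetical invocation):
#   cmake -B redhat-linux-build_cuda-12 -DCMAKE_CUDA_FLAGS="${EXTRA_CUDAFLAGS}"
echo "${EXTRA_CUDAFLAGS}"
```

Alternatively, dropping the pre-sm_75 entries from the architecture list removes the warning at its source, at the cost of no longer producing cubins for Maxwell/Pascal/Volta GPUs.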
[ 93%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o nvcc warning : 
Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 93%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 95%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 95%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 97%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 97%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[100%] Linking CUDA shared module ../../../../../../lib/ollama/libggml-cuda.so
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /usr/bin/cmake -E cmake_link_script CMakeFiles/ggml-cuda.dir/link.txt --verbose=1
/usr/bin/g++-14 -fPIC -Wl,--dependency-file=CMakeFiles/ggml-cuda.dir/link.d -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -shared -o ../../../../../../lib/ollama/libggml-cuda.so @CMakeFiles/ggml-cuda.dir/objects1.rsp @CMakeFiles/ggml-cuda.dir/linkLibs.rsp -L"/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/targets/x86_64-linux/lib/stubs" -L"/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/targets/x86_64-linux/lib"
gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12'
[100%] Built target ggml-cuda
gmake[2]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12'
/usr/bin/cmake -E cmake_progress_start /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/CMakeFiles 0
gmake[1]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12'
+ RPM_EC=0
++ jobs -p
+ exit 0
Executing(%install): /bin/sh -e /var/tmp/rpm-tmp.3nX7KD
+ umask 022
+ cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ '[' /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT '!=' / ']'
+ rm -rf /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT
++ dirname /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT
+ mkdir -p /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ mkdir /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT
+ CFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer '
+ export CFLAGS
+ CXXFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer '
+ export CXXFLAGS
+ FFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules '
+ export FFLAGS
+ FCFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules '
+ export FCFLAGS
+ VALAFLAGS=-g
+ export VALAFLAGS
+ RUSTFLAGS='-Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Cforce-frame-pointers=yes -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn'
+ export RUSTFLAGS
+ LDFLAGS='-Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes '
+ export LDFLAGS
+ LT_SYS_LIBRARY_PATH=/usr/lib64:
+ export LT_SYS_LIBRARY_PATH
+ CC=gcc
+ export CC
+ CXX=g++
+ export CXX
+ cd ollama-0.12.3
+ DESTDIR=/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT
+ /usr/bin/cmake --install redhat-linux-build_cuda-13 --component CUDA
-- Install configuration: "Release"
-- Installing: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/lib64/ollama/cuda_v13/libggml-cuda.so
-- Set non-toolchain portion of runtime path of "/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/lib64/ollama/cuda_v13/libggml-cuda.so" to ""
+ DESTDIR=/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT
+ /usr/bin/cmake --install redhat-linux-build_cuda-12 --component CUDA
-- Install configuration: "Release"
-- Installing: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/lib64/ollama/cuda_v12/libggml-cuda.so
-- Set non-toolchain portion of runtime path of "/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/lib64/ollama/cuda_v12/libggml-cuda.so" to ""
+ /usr/bin/find-debuginfo -j2 --strict-build-id -m -i --build-id-seed 0.12.3-1.fc42 --unique-debug-suffix -0.12.3-1.fc42.x86_64 --unique-debug-src-base ollama-ggml-cuda-0.12.3-1.fc42.x86_64 --run-dwz --dwz-low-mem-die-limit 10000000 --dwz-max-die-limit 110000000 -S debugsourcefiles.list /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3
find-debuginfo: starting
Extracting debug info from 2 files
DWARF-compressing 2 files
sepdebugcrcfix: Updated 2 CRC32s, 0 CRC32s did match.
Creating .debug symlinks for symlinks to ELF files
Copying sources found by 'debugedit -l' to /usr/src/debug/ollama-ggml-cuda-0.12.3-1.fc42.x86_64
find-debuginfo: done
+ /usr/lib/rpm/check-buildroot
+ /usr/lib/rpm/redhat/brp-ldconfig
+ /usr/lib/rpm/brp-compress
+ /usr/lib/rpm/redhat/brp-strip-lto /usr/bin/strip
+ /usr/lib/rpm/brp-strip-static-archive /usr/bin/strip
+ /usr/lib/rpm/check-rpaths
+ /usr/lib/rpm/redhat/brp-mangle-shebangs
+ /usr/lib/rpm/brp-remove-la-files
+ env /usr/lib/rpm/redhat/brp-python-bytecompile '' 1 0 -j2
+ /usr/lib/rpm/redhat/brp-python-hardlink
+ /usr/bin/add-determinism --brp -j2 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT
Scanned 39 directories and 162 files, processed 0 inodes, 0 modified (0 replaced + 0 rewritten), 0 unsupported format, 0 errors
Reading /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/SPECPARTS/rpm-debuginfo.specpart
Processing files: ollama-ggml-cuda-13-0.12.3-1.fc42.x86_64
Executing(%license): /bin/sh -e /var/tmp/rpm-tmp.dFmKNX
+ umask 022
+ cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ cd ollama-0.12.3
+ LICENSEDIR=/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/share/licenses/ollama-ggml-cuda-13
+ export LC_ALL=C.UTF-8
+ LC_ALL=C.UTF-8
+ export LICENSEDIR
+ /usr/bin/mkdir -p /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/share/licenses/ollama-ggml-cuda-13
+ cp -pr /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/LICENSE /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/share/licenses/ollama-ggml-cuda-13
+ RPM_EC=0
++ jobs -p
+ exit 0
Provides: libggml-cuda.so()(64bit) ollama-ggml-cuda-13 = 0.12.3-1.fc42 ollama-ggml-cuda-13(x86-64) = 0.12.3-1.fc42
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Requires: libc.so.6()(64bit) libc.so.6(GLIBC_2.14)(64bit) libc.so.6(GLIBC_2.2.5)(64bit) libc.so.6(GLIBC_ABI_DT_RELR)(64bit) libcublas.so.13()(64bit) libcublas.so.13(libcublas.so.13)(64bit) libcuda.so.1()(64bit) libcudart.so.13()(64bit) libcudart.so.13(libcudart.so.13)(64bit) libgcc_s.so.1()(64bit) libgcc_s.so.1(GCC_3.0)(64bit) libm.so.6()(64bit) libm.so.6(GLIBC_2.27)(64bit) libstdc++.so.6()(64bit) libstdc++.so.6(CXXABI_1.3)(64bit) libstdc++.so.6(CXXABI_1.3.9)(64bit) libstdc++.so.6(GLIBCXX_3.4)(64bit) libstdc++.so.6(GLIBCXX_3.4.11)(64bit) libstdc++.so.6(GLIBCXX_3.4.21)(64bit) libstdc++.so.6(GLIBCXX_3.4.30)(64bit) libstdc++.so.6(GLIBCXX_3.4.32)(64bit) rtld(GNU_HASH)
Supplements: if libcublas-13-0 ollama-ggml
Processing files: ollama-ggml-cuda-12-0.12.3-1.fc42.x86_64
Executing(%license): /bin/sh -e /var/tmp/rpm-tmp.vKWo2M
+ umask 022
+ cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ cd ollama-0.12.3
+ LICENSEDIR=/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/share/licenses/ollama-ggml-cuda-12
+ export LC_ALL=C.UTF-8
+ LC_ALL=C.UTF-8
+ export LICENSEDIR
+ /usr/bin/mkdir -p /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/share/licenses/ollama-ggml-cuda-12
+ cp -pr /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/LICENSE /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/share/licenses/ollama-ggml-cuda-12
+ RPM_EC=0
++ jobs -p
+ exit 0
Provides: libggml-cuda.so()(64bit) ollama-ggml-cuda-12 = 0.12.3-1.fc42 ollama-ggml-cuda-12(x86-64) = 0.12.3-1.fc42
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Requires: libc.so.6()(64bit) libc.so.6(GLIBC_2.14)(64bit) libc.so.6(GLIBC_2.2.5)(64bit) libc.so.6(GLIBC_ABI_DT_RELR)(64bit) libcublas.so.12()(64bit) libcublas.so.12(libcublas.so.12)(64bit) libcuda.so.1()(64bit) libcudart.so.12()(64bit) libcudart.so.12(libcudart.so.12)(64bit) libgcc_s.so.1()(64bit) libgcc_s.so.1(GCC_3.0)(64bit) libm.so.6()(64bit) libm.so.6(GLIBC_2.27)(64bit) libstdc++.so.6()(64bit) libstdc++.so.6(CXXABI_1.3)(64bit) libstdc++.so.6(CXXABI_1.3.9)(64bit) libstdc++.so.6(GLIBCXX_3.4)(64bit) libstdc++.so.6(GLIBCXX_3.4.11)(64bit) libstdc++.so.6(GLIBCXX_3.4.21)(64bit) libstdc++.so.6(GLIBCXX_3.4.30)(64bit) libstdc++.so.6(GLIBCXX_3.4.32)(64bit) rtld(GNU_HASH)
Supplements: if libcublas-12-9 ollama-ggml
Processing files: ollama-ggml-cuda-debugsource-0.12.3-1.fc42.x86_64
Provides: ollama-ggml-cuda-debugsource = 0.12.3-1.fc42 ollama-ggml-cuda-debugsource(x86-64) = 0.12.3-1.fc42
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Processing files: ollama-ggml-cuda-debuginfo-0.12.3-1.fc42.x86_64
Provides: ollama-ggml-cuda-debuginfo = 0.12.3-1.fc42 ollama-ggml-cuda-debuginfo(x86-64) = 0.12.3-1.fc42
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Recommends: ollama-ggml-cuda-debugsource(x86-64) = 0.12.3-1.fc42
Processing files: ollama-ggml-cuda-13-debuginfo-0.12.3-1.fc42.x86_64
Provides: debuginfo(build-id) = bd6ea8e8019ee241b27fbb38c682387faed925ac libggml-cuda.so-0.12.3-1.fc42.x86_64.debug()(64bit) ollama-ggml-cuda-13-debuginfo = 0.12.3-1.fc42 ollama-ggml-cuda-13-debuginfo(x86-64) = 0.12.3-1.fc42
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Recommends: ollama-ggml-cuda-debugsource(x86-64) = 0.12.3-1.fc42
Processing files: ollama-ggml-cuda-12-debuginfo-0.12.3-1.fc42.x86_64
Provides: debuginfo(build-id) = f923ca8326c9ddd427231ebcac67552682233b1d libggml-cuda.so-0.12.3-1.fc42.x86_64.debug()(64bit) ollama-ggml-cuda-12-debuginfo = 0.12.3-1.fc42 ollama-ggml-cuda-12-debuginfo(x86-64) = 0.12.3-1.fc42
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Recommends: ollama-ggml-cuda-debugsource(x86-64) = 0.12.3-1.fc42
Checking for unpackaged file(s): /usr/lib/rpm/check-files /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT
Wrote: /builddir/build/RPMS/ollama-ggml-cuda-13-0.12.3-1.fc42.x86_64.rpm
Wrote: /builddir/build/RPMS/ollama-ggml-cuda-13-debuginfo-0.12.3-1.fc42.x86_64.rpm
Wrote: /builddir/build/RPMS/ollama-ggml-cuda-12-debuginfo-0.12.3-1.fc42.x86_64.rpm
Wrote: /builddir/build/RPMS/ollama-ggml-cuda-debugsource-0.12.3-1.fc42.x86_64.rpm
Wrote: /builddir/build/RPMS/ollama-ggml-cuda-debuginfo-0.12.3-1.fc42.x86_64.rpm
Wrote: /builddir/build/RPMS/ollama-ggml-cuda-12-0.12.3-1.fc42.x86_64.rpm
Executing(rmbuild): /bin/sh -e /var/tmp/rpm-tmp.5p4cC3
+ umask 022
+ cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ test -d /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ /usr/bin/chmod -Rf a+rX,u+w,g-w,o-w /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ rm -rf /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ RPM_EC=0
++ jobs -p
+ exit 0
Finish: rpmbuild ollama-ggml-cuda-0.12.3-1.fc42.src.rpm
Finish: build phase for ollama-ggml-cuda-0.12.3-1.fc42.src.rpm
INFO: chroot_scan: 1 files copied to /var/lib/copr-rpmbuild/results/chroot_scan
INFO: /var/lib/mock/fedora-42-x86_64-1759428480.475249/root/var/log/dnf5.log
INFO: chroot_scan: creating tarball /var/lib/copr-rpmbuild/results/chroot_scan.tar.gz
/bin/tar: Removing leading `/' from member names
INFO: Done(/var/lib/copr-rpmbuild/results/ollama-ggml-cuda-0.12.3-1.fc42.src.rpm) Config(child) 179 minutes 33 seconds
INFO: Results and/or logs in: /var/lib/copr-rpmbuild/results
INFO: Cleaning up build root ('cleanup_on_success=True')
Start: clean chroot
INFO: unmounting tmpfs.
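The Provides/Requires blocks above are generated automatically by rpmbuild from the ELF sonames each `libggml-cuda.so` links against, and that is what keeps the two subpackages apart: `ollama-ggml-cuda-13` depends on the `.so.13` CUDA toolkit libraries, `ollama-ggml-cuda-12` on the `.so.12` ones, while the driver interface `libcuda.so.1` is shared by both. A minimal sketch of that split, with soname lists trimmed from the log (the `req*.txt` file names are only illustrative):

```shell
# Trimmed dependency lists from the two "Requires:" blocks above,
# sorted because comm(1) expects sorted input.
printf '%s\n' libcublas.so.13 libcudart.so.13 libcuda.so.1 | sort > req13.txt
printf '%s\n' libcublas.so.12 libcudart.so.12 libcuda.so.1 | sort > req12.txt

# comm -12 prints only the dependencies common to both subpackages:
# the CUDA driver soname, which is versioned independently of the toolkit.
comm -12 req13.txt req12.txt    # -> libcuda.so.1
```

Because all toolkit dependencies are versioned sonames, both subpackages can be installed side by side (`cuda_v12/` and `cuda_v13/` under `/usr/lib64/ollama/`) and the loader resolves whichever one the runtime picks.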
Finish: clean chroot
Finish: run
Running RPMResults tool
Package info: {
  "packages": [
    { "name": "ollama-ggml-cuda-12", "epoch": null, "version": "0.12.3", "release": "1.fc42", "arch": "x86_64" },
    { "name": "ollama-ggml-cuda-debuginfo", "epoch": null, "version": "0.12.3", "release": "1.fc42", "arch": "x86_64" },
    { "name": "ollama-ggml-cuda", "epoch": null, "version": "0.12.3", "release": "1.fc42", "arch": "src" },
    { "name": "ollama-ggml-cuda-debugsource", "epoch": null, "version": "0.12.3", "release": "1.fc42", "arch": "x86_64" },
    { "name": "ollama-ggml-cuda-13-debuginfo", "epoch": null, "version": "0.12.3", "release": "1.fc42", "arch": "x86_64" },
    { "name": "ollama-ggml-cuda-12-debuginfo", "epoch": null, "version": "0.12.3", "release": "1.fc42", "arch": "x86_64" },
    { "name": "ollama-ggml-cuda-13", "epoch": null, "version": "0.12.3", "release": "1.fc42", "arch": "x86_64" }
  ]
}
RPMResults finished
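The closing "Package info" blob is plain JSON, so the list of produced artifacts can be post-processed without any RPM tooling. A rough sketch under the assumption that the blob has been saved to a local file (`results.json` is hypothetical, and the copy below is shortened):

```shell
# Shortened, hypothetical copy of the RPMResults "Package info" JSON.
cat > results.json <<'EOF'
{ "packages": [
    { "name": "ollama-ggml-cuda-12", "version": "0.12.3", "arch": "x86_64" },
    { "name": "ollama-ggml-cuda",    "version": "0.12.3", "arch": "src" },
    { "name": "ollama-ggml-cuda-13", "version": "0.12.3", "arch": "x86_64" }
] }
EOF

# Pull out every "name" value; a crude regex is enough for this flat,
# known layout (a real pipeline would use jq or a JSON parser).
grep -o '"name": *"[^"]*"' results.json | cut -d'"' -f4
```

For anything beyond a quick glance, `jq -r '.packages[].name'` over the full blob is the more robust equivalent.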