Warning: Permanently added '54.162.7.100' (ED25519) to the list of known hosts.

You can reproduce this build on your computer by running:

sudo dnf install copr-rpmbuild
/usr/bin/copr-rpmbuild --verbose --drop-resultdir --task-url https://copr.fedorainfracloud.org/backend/get-build-task/9640922-fedora-43-x86_64 --chroot fedora-43-x86_64

Version: 1.6
PID: 8534
Logging PID: 8536
Task:
{'allow_user_ssh': False,
 'appstream': False,
 'background': False,
 'build_id': 9640922,
 'buildroot_pkgs': [],
 'chroot': 'fedora-43-x86_64',
 'enable_net': False,
 'fedora_review': False,
 'git_hash': 'd303daa4126ca907fd5db0a5f5e8d3715a737765',
 'git_repo': 'https://copr-dist-git.fedorainfracloud.org/git/fachep/ollama/ollama-ggml-cuda',
 'isolation': 'default',
 'memory_reqs': 2048,
 'package_name': 'ollama-ggml-cuda',
 'package_version': '0.12.3-1',
 'project_dirname': 'ollama',
 'project_name': 'ollama',
 'project_owner': 'fachep',
 'repo_priority': None,
 'repos': [{'baseurl': 'https://download.copr.fedorainfracloud.org/results/fachep/ollama/fedora-43-x86_64/',
            'id': 'copr_base',
            'name': 'Copr repository',
            'priority': None},
           {'baseurl': 'https://developer.download.nvidia.cn/compute/cuda/repos/fedora42/x86_64/',
            'id': 'https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64',
            'name': 'Additional repo https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64'},
           {'baseurl': 'https://developer.download.nvidia.cn/compute/cuda/repos/fedora41/x86_64/',
            'id': 'https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64',
            'name': 'Additional repo https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64'}],
 'sandbox': 'fachep/ollama--fachep',
 'source_json': {},
 'source_type': None,
 'ssh_public_keys': None,
 'storage': 0,
 'submitter': 'fachep',
 'tags': [],
 'task_id': '9640922-fedora-43-x86_64',
 'timeout': 18000,
 'uses_devel_repo': False,
 'with_opts': [],
 'without_opts': []}

Running: git clone https://copr-dist-git.fedorainfracloud.org/git/fachep/ollama/ollama-ggml-cuda /var/lib/copr-rpmbuild/workspace/workdir-ox7e730h/ollama-ggml-cuda --depth 500 --no-single-branch --recursive
cmd: ['git', 'clone', 'https://copr-dist-git.fedorainfracloud.org/git/fachep/ollama/ollama-ggml-cuda', '/var/lib/copr-rpmbuild/workspace/workdir-ox7e730h/ollama-ggml-cuda', '--depth', '500', '--no-single-branch', '--recursive']
cwd: .
rc: 0
stdout:
stderr:
Cloning into '/var/lib/copr-rpmbuild/workspace/workdir-ox7e730h/ollama-ggml-cuda'...

Running: git checkout d303daa4126ca907fd5db0a5f5e8d3715a737765 --
cmd: ['git', 'checkout', 'd303daa4126ca907fd5db0a5f5e8d3715a737765', '--']
cwd: /var/lib/copr-rpmbuild/workspace/workdir-ox7e730h/ollama-ggml-cuda
rc: 0
stdout:
stderr:
Note: switching to 'd303daa4126ca907fd5db0a5f5e8d3715a737765'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command.
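The same dist-git state can also be fetched by hand to inspect exactly what was built; a minimal sketch reusing the repository URL and commit hash from the log above (assumes only git and network access to the Copr dist-git instance):

  git clone https://copr-dist-git.fedorainfracloud.org/git/fachep/ollama/ollama-ggml-cuda --depth 500 --no-single-branch --recursive
  cd ollama-ggml-cuda
  git checkout d303daa4126ca907fd5db0a5f5e8d3715a737765 --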
Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at d303daa automatic import of ollama-ggml-cuda

Running: dist-git-client sources
cmd: ['dist-git-client', 'sources']
cwd: /var/lib/copr-rpmbuild/workspace/workdir-ox7e730h/ollama-ggml-cuda
rc: 0
stdout:
stderr:
INFO: Reading stdout from command: git rev-parse --abbrev-ref HEAD
INFO: Reading stdout from command: git rev-parse HEAD
INFO: Reading sources specification file: sources
INFO: Downloading v0.12.3.tar.gz
INFO: Reading stdout from command: curl --help all
INFO: Calling: curl -H Pragma: -o v0.12.3.tar.gz --location --connect-timeout 60 --retry 3 --retry-delay 10 --remote-time --show-error --fail --retry-all-errors https://copr-dist-git.fedorainfracloud.org/repo/pkgs/fachep/ollama/ollama-ggml-cuda/v0.12.3.tar.gz/md5/f096acee5e82596e9afd4d07ed477de2/v0.12.3.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 10.5M  100 10.5M    0     0   163M      0 --:--:-- --:--:-- --:--:--  161M
INFO: Reading stdout from command: md5sum v0.12.3.tar.gz
tail: /var/lib/copr-rpmbuild/main.log: file truncated

Running (timeout=18000): unbuffer mock --spec /var/lib/copr-rpmbuild/workspace/workdir-ox7e730h/ollama-ggml-cuda/ollama-ggml-cuda.spec --sources /var/lib/copr-rpmbuild/workspace/workdir-ox7e730h/ollama-ggml-cuda --resultdir /var/lib/copr-rpmbuild/results --uniqueext 1759434727.591343 -r /var/lib/copr-rpmbuild/results/configs/child.cfg
INFO: mock.py version 6.3 starting (python version = 3.13.7, NVR = mock-6.3-1.fc42), args: /usr/libexec/mock/mock --spec /var/lib/copr-rpmbuild/workspace/workdir-ox7e730h/ollama-ggml-cuda/ollama-ggml-cuda.spec --sources /var/lib/copr-rpmbuild/workspace/workdir-ox7e730h/ollama-ggml-cuda --resultdir /var/lib/copr-rpmbuild/results --uniqueext 1759434727.591343 -r /var/lib/copr-rpmbuild/results/configs/child.cfg
Start(bootstrap): init plugins
INFO: tmpfs initialized
INFO: selinux enabled
INFO: chroot_scan: initialized
INFO: compress_logs: initialized
Finish(bootstrap): init plugins
Start: init plugins
INFO: tmpfs initialized
INFO: selinux enabled
INFO: chroot_scan: initialized
INFO: compress_logs: initialized
Finish: init plugins
INFO: Signal handler active
Start: run
INFO: Start(/var/lib/copr-rpmbuild/workspace/workdir-ox7e730h/ollama-ggml-cuda/ollama-ggml-cuda.spec) Config(fedora-43-x86_64)
Start: clean chroot
Finish: clean chroot
Mock Version: 6.3
INFO: Mock Version: 6.3
Start(bootstrap): chroot init
INFO: mounting tmpfs at /var/lib/mock/fedora-43-x86_64-bootstrap-1759434727.591343/root.
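The sources URL above embeds the expected MD5 digest (f096acee5e82596e9afd4d07ed477de2), which dist-git-client re-checks with md5sum after the download. The same verification can be replayed by hand; a sketch reusing the URL and checksum from the curl call in the log:

  curl --location --retry 3 --retry-delay 10 --fail -o v0.12.3.tar.gz \
    https://copr-dist-git.fedorainfracloud.org/repo/pkgs/fachep/ollama/ollama-ggml-cuda/v0.12.3.tar.gz/md5/f096acee5e82596e9afd4d07ed477de2/v0.12.3.tar.gz
  echo 'f096acee5e82596e9afd4d07ed477de2  v0.12.3.tar.gz' | md5sum --check -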
INFO: calling preinit hooks
INFO: enabled root cache
INFO: enabled package manager cache
Start(bootstrap): cleaning package manager metadata
Finish(bootstrap): cleaning package manager metadata
INFO: Guessed host environment type: unknown
INFO: Using container image: registry.fedoraproject.org/fedora:43
INFO: Pulling image: registry.fedoraproject.org/fedora:43
INFO: Tagging container image as mock-bootstrap-2fe5f8da-78b0-4d80-9fdd-54c11f567ce9
INFO: Checking that fbd2b7ac2fe12801f103f414deb42fb99dcc02d30a594720a2ddf5f06c31fecf image matches host's architecture
INFO: Copy content of container fbd2b7ac2fe12801f103f414deb42fb99dcc02d30a594720a2ddf5f06c31fecf to /var/lib/mock/fedora-43-x86_64-bootstrap-1759434727.591343/root
INFO: mounting fbd2b7ac2fe12801f103f414deb42fb99dcc02d30a594720a2ddf5f06c31fecf with podman image mount
INFO: image fbd2b7ac2fe12801f103f414deb42fb99dcc02d30a594720a2ddf5f06c31fecf as /var/lib/containers/storage/overlay/8e4aa573aacb9609a613eaf37ee7a61670ba148dc3acd2855ea41179f18ba5af/merged
INFO: umounting image fbd2b7ac2fe12801f103f414deb42fb99dcc02d30a594720a2ddf5f06c31fecf (/var/lib/containers/storage/overlay/8e4aa573aacb9609a613eaf37ee7a61670ba148dc3acd2855ea41179f18ba5af/merged) with podman image umount
INFO: Removing image mock-bootstrap-2fe5f8da-78b0-4d80-9fdd-54c11f567ce9
INFO: Package manager dnf5 detected and used (fallback)
INFO: Not updating bootstrap chroot, bootstrap_image_ready=True
Start(bootstrap): creating root cache
Finish(bootstrap): creating root cache
Finish(bootstrap): chroot init
Start: chroot init
INFO: mounting tmpfs at /var/lib/mock/fedora-43-x86_64-1759434727.591343/root.
INFO: calling preinit hooks
INFO: enabled root cache
INFO: enabled package manager cache
Start: cleaning package manager metadata
Finish: cleaning package manager metadata
INFO: enabled HW Info plugin
INFO: Package manager dnf5 detected and used (direct choice)
INFO: Buildroot is handled by package management downloaded with a bootstrap image: rpm-6.0.0-1.fc43.x86_64 rpm-sequoia-1.9.0-2.fc43.x86_64 dnf5-5.2.17.0-2.fc43.x86_64 dnf5-plugins-5.2.17.0-2.fc43.x86_64
Start: installing minimal buildroot with dnf5
Updating and loading repositories:
 Copr repository                        100% |   5.9 KiB/s |   1.6 KiB | 00m00s
 Additional repo https_developer_downlo 100% |  69.5 KiB/s |  47.8 KiB | 00m01s
 updates                                100% |  34.2 KiB/s |  33.3 KiB | 00m01s
 Additional repo https_developer_downlo 100% | 144.9 KiB/s | 109.0 KiB | 00m01s
 fedora                                 100% |  26.7 MiB/s |  42.2 MiB | 00m02s
Repositories loaded.
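Everything from here on runs inside the mock buildroot described by the generated child.cfg. For debugging, the same kind of buildroot can be entered interactively with mock's shell mode; a sketch (assuming a local copr-rpmbuild run has left results/configs in place, or falling back to the stock Fedora 43 config shipped with mock):

  mock -r /var/lib/copr-rpmbuild/results/configs/child.cfg --shell
  mock -r fedora-43-x86_64 --shell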
Package Arch Version Repository Size
Installing group/module packages:
 bash x86_64 5.3.0-2.fc43 fedora 8.4 MiB
 bzip2 x86_64 1.0.8-21.fc43 fedora 95.3 KiB
 coreutils x86_64 9.7-5.fc43 fedora 5.4 MiB
 cpio x86_64 2.15-6.fc43 fedora 1.1 MiB
 diffutils x86_64 3.12-3.fc43 fedora 1.6 MiB
 fedora-release-common noarch 43-0.22 fedora 20.4 KiB
 findutils x86_64 1:4.10.0-6.fc43 fedora 1.8 MiB
 gawk x86_64 5.3.2-2.fc43 fedora 1.8 MiB
 glibc-minimal-langpack x86_64 2.42-4.fc43 fedora 0.0 B
 grep x86_64 3.12-2.fc43 fedora 1.0 MiB
 gzip x86_64 1.13-4.fc43 fedora 388.8 KiB
 info x86_64 7.2-6.fc43 fedora 353.9 KiB
 patch x86_64 2.8-2.fc43 fedora 222.8 KiB
 redhat-rpm-config noarch 343-11.fc43 fedora 182.9 KiB
 rpm-build x86_64 6.0.0-1.fc43 fedora 287.4 KiB
 sed x86_64 4.9-5.fc43 fedora 857.3 KiB
 shadow-utils x86_64 2:4.18.0-3.fc43 fedora 3.9 MiB
 tar x86_64 2:1.35-6.fc43 fedora 2.9 MiB
 unzip x86_64 6.0-67.fc43 fedora 386.3 KiB
 util-linux x86_64 2.41.1-16.fc43 fedora 3.5 MiB
 which x86_64 2.23-3.fc43 fedora 83.5 KiB
 xz x86_64 1:5.8.1-2.fc43 fedora 1.3 MiB
Installing dependencies:
 add-determinism x86_64 0.6.0-2.fc43 fedora 2.4 MiB
 alternatives x86_64 1.33-2.fc43 fedora 62.2 KiB
 ansible-srpm-macros noarch 1-18.1.fc43 fedora 35.7 KiB
 audit-libs x86_64 4.1.1-2.fc43 fedora 378.8 KiB
 binutils x86_64 2.45-1.fc43 fedora 26.5 MiB
 build-reproducibility-srpm-macros noarch 0.6.0-2.fc43 fedora 735.0 B
 bzip2-libs x86_64 1.0.8-21.fc43 fedora 80.6 KiB
 ca-certificates noarch 2025.2.80_v9.0.304-1.1.fc43 fedora 2.7 MiB
 coreutils-common x86_64 9.7-5.fc43 fedora 11.3 MiB
 crypto-policies noarch 20250714-5.gitcd6043a.fc43 fedora 146.9 KiB
 curl x86_64 8.15.0-2.fc43 fedora 473.6 KiB
 cyrus-sasl-lib x86_64 2.1.28-33.fc43 fedora 2.3 MiB
 debugedit x86_64 5.2-3.fc43 fedora 214.0 KiB
 dwz x86_64 0.16-2.fc43 fedora 287.1 KiB
 ed x86_64 1.22.2-1.fc43 fedora 148.1 KiB
 efi-srpm-macros noarch 6-4.fc43 fedora 40.1 KiB
 elfutils x86_64 0.193-3.fc43 fedora 2.9 MiB
 elfutils-debuginfod-client x86_64 0.193-3.fc43 fedora 83.9 KiB
 elfutils-default-yama-scope noarch 0.193-3.fc43 fedora 1.8 KiB
 elfutils-libelf x86_64 0.193-3.fc43 fedora 1.2 MiB
 elfutils-libs x86_64 0.193-3.fc43 fedora 683.4 KiB
 fedora-gpg-keys noarch 43-0.4 fedora 131.2 KiB
 fedora-release noarch 43-0.22 fedora 0.0 B
 fedora-release-identity-basic noarch 43-0.22 fedora 658.0 B
 fedora-repos noarch 43-0.4 fedora 4.9 KiB
 file x86_64 5.46-8.fc43 fedora 100.2 KiB
 file-libs x86_64 5.46-8.fc43 fedora 11.9 MiB
 filesystem x86_64 3.18-50.fc43 fedora 112.0 B
 filesystem-srpm-macros noarch 3.18-50.fc43 fedora 38.2 KiB
 fonts-srpm-macros noarch 1:2.0.5-23.fc43 fedora 55.8 KiB
 forge-srpm-macros noarch 0.4.0-3.fc43 fedora 38.9 KiB
 fpc-srpm-macros noarch 1.3-15.fc43 fedora 144.0 B
 gap-srpm-macros noarch 1-1.fc43 fedora 2.0 KiB
 gdb-minimal x86_64 16.3-6.fc43 fedora 13.3 MiB
 gdbm-libs x86_64 1:1.23-10.fc43 fedora 129.9 KiB
 ghc-srpm-macros noarch 1.9.2-3.fc43 fedora 779.0 B
 glibc x86_64 2.42-4.fc43 fedora 6.7 MiB
 glibc-common x86_64 2.42-4.fc43 fedora 1.0 MiB
 glibc-gconv-extra x86_64 2.42-4.fc43 fedora 7.2 MiB
 gmp x86_64 1:6.3.0-4.fc43 fedora 811.2 KiB
 gnat-srpm-macros noarch 6-8.fc43 fedora 1.0 KiB
 gnupg2 x86_64 2.4.8-4.fc43 fedora 6.5 MiB
 gnupg2-dirmngr x86_64 2.4.8-4.fc43 fedora 618.4 KiB
 gnupg2-gpg-agent x86_64 2.4.8-4.fc43 fedora 671.4 KiB
 gnupg2-gpgconf x86_64 2.4.8-4.fc43 fedora 250.0 KiB
 gnupg2-keyboxd x86_64 2.4.8-4.fc43 fedora 201.4 KiB
 gnupg2-verify x86_64 2.4.8-4.fc43 fedora 348.5 KiB
 gnutls x86_64 3.8.10-3.fc43 fedora 3.8 MiB
 go-srpm-macros noarch 3.8.0-1.fc43 fedora 61.9 KiB
 gpgverify noarch 2.2-3.fc43 fedora 8.7 KiB
 ima-evm-utils-libs x86_64 1.6.2-6.fc43 fedora 60.7 KiB
 jansson x86_64 2.14-3.fc43 fedora 89.1 KiB
 java-srpm-macros noarch 1-7.fc43 fedora 870.0 B
 json-c x86_64 0.18-7.fc43 fedora 82.7 KiB
 kernel-srpm-macros noarch 1.0-27.fc43 fedora 1.9 KiB
 keyutils-libs x86_64 1.6.3-6.fc43 fedora 54.3 KiB
 krb5-libs x86_64 1.21.3-7.fc43 fedora 2.3 MiB
 libacl x86_64 2.3.2-4.fc43 fedora 35.9 KiB
 libarchive x86_64 3.8.1-3.fc43 fedora 951.1 KiB
 libassuan x86_64 2.5.7-4.fc43 fedora 163.8 KiB
 libattr x86_64 2.5.2-6.fc43 fedora 24.4 KiB
 libblkid x86_64 2.41.1-16.fc43 fedora 262.4 KiB
 libbrotli x86_64 1.1.0-10.fc43 fedora 833.3 KiB
 libcap x86_64 2.76-3.fc43 fedora 209.1 KiB
 libcap-ng x86_64 0.8.5-7.fc43 fedora 68.9 KiB
 libcom_err x86_64 1.47.3-2.fc43 fedora 63.1 KiB
 libcurl x86_64 8.15.0-2.fc43 fedora 903.2 KiB
 libeconf x86_64 0.7.9-2.fc43 fedora 64.9 KiB
 libevent x86_64 2.1.12-16.fc43 fedora 883.1 KiB
 libfdisk x86_64 2.41.1-16.fc43 fedora 380.4 KiB
 libffi x86_64 3.5.1-2.fc43 fedora 83.6 KiB
 libfsverity x86_64 1.6-3.fc43 fedora 28.5 KiB
 libgcc x86_64 15.2.1-2.fc43 fedora 266.6 KiB
 libgcrypt x86_64 1.11.1-2.fc43 fedora 1.6 MiB
 libgomp x86_64 15.2.1-2.fc43 fedora 541.1 KiB
 libgpg-error x86_64 1.55-2.fc43 fedora 915.3 KiB
 libidn2 x86_64 2.3.8-2.fc43 fedora 552.5 KiB
 libksba x86_64 1.6.7-4.fc43 fedora 398.5 KiB
 liblastlog2 x86_64 2.41.1-16.fc43 fedora 33.9 KiB
 libmount x86_64 2.41.1-16.fc43 fedora 372.7 KiB
 libnghttp2 x86_64 1.66.0-2.fc43 fedora 162.2 KiB
 libpkgconf x86_64 2.3.0-3.fc43 fedora 78.1 KiB
 libpsl x86_64 0.21.5-6.fc43 fedora 76.4 KiB
 libselinux x86_64 3.9-5.fc43 fedora 193.1 KiB
 libsemanage x86_64 3.9-4.fc43 fedora 308.5 KiB
 libsepol x86_64 3.9-2.fc43 fedora 822.0 KiB
 libsmartcols x86_64 2.41.1-16.fc43 fedora 180.5 KiB
 libssh x86_64 0.11.3-1.fc43 fedora 567.1 KiB
 libssh-config noarch 0.11.3-1.fc43 fedora 277.0 B
 libstdc++ x86_64 15.2.1-2.fc43 fedora 2.8 MiB
 libtasn1 x86_64 4.20.0-2.fc43 fedora 176.3 KiB
 libtool-ltdl x86_64 2.5.4-7.fc43 fedora 70.1 KiB
 libunistring x86_64 1.1-10.fc43 fedora 1.7 MiB
 libusb1 x86_64 1.0.29-4.fc43 fedora 171.3 KiB
 libuuid x86_64 2.41.1-16.fc43 fedora 37.4 KiB
 libverto x86_64 0.3.2-11.fc43 fedora 25.4 KiB
 libxcrypt x86_64 4.4.38-8.fc43 fedora 284.5 KiB
 libxml2 x86_64 2.12.10-4.fc43 fedora 1.7 MiB
 libzstd x86_64 1.5.7-2.fc43 fedora 799.9 KiB
 lua-libs x86_64 5.4.8-2.fc43 fedora 280.8 KiB
 lua-srpm-macros noarch 1-16.fc43 fedora 1.3 KiB
 lz4-libs x86_64 1.10.0-3.fc43 fedora 161.4 KiB
 mpfr x86_64 4.2.2-2.fc43 fedora 832.8 KiB
 ncurses-base noarch 6.5-7.20250614.fc43 fedora 328.1 KiB
 ncurses-libs x86_64 6.5-7.20250614.fc43 fedora 946.3 KiB
 nettle x86_64 3.10.1-2.fc43 fedora 790.6 KiB
 npth x86_64 1.8-3.fc43 fedora 49.6 KiB
 ocaml-srpm-macros noarch 11-2.fc43 fedora 1.9 KiB
 openblas-srpm-macros noarch 2-20.fc43 fedora 112.0 B
 openldap x86_64 2.6.10-4.fc43 fedora 659.9 KiB
 openssl-libs x86_64 1:3.5.1-2.fc43 fedora 8.9 MiB
 p11-kit x86_64 0.25.8-1.fc43 fedora 2.3 MiB
 p11-kit-trust x86_64 0.25.8-1.fc43 fedora 446.5 KiB
 package-notes-srpm-macros noarch 0.5-14.fc43 fedora 1.6 KiB
 pam-libs x86_64 1.7.1-3.fc43 fedora 126.8 KiB
 pcre2 x86_64 10.46-1.fc43 fedora 697.7 KiB
 pcre2-syntax noarch 10.46-1.fc43 fedora 275.3 KiB
 perl-srpm-macros noarch 1-60.fc43 fedora 861.0 B
 pkgconf x86_64 2.3.0-3.fc43 fedora 88.5 KiB
 pkgconf-m4 noarch 2.3.0-3.fc43 fedora 14.4 KiB
 pkgconf-pkg-config x86_64 2.3.0-3.fc43 fedora 989.0 B
 popt x86_64 1.19-9.fc43 fedora 132.8 KiB
 publicsuffix-list-dafsa noarch 20250616-2.fc43 fedora 69.1 KiB
 pyproject-srpm-macros noarch 1.18.4-1.fc43 fedora 1.9 KiB
 python-srpm-macros noarch 3.14-5.fc43 fedora 51.5 KiB
 qt5-srpm-macros noarch 5.15.17-2.fc43 fedora 500.0 B
 qt6-srpm-macros noarch 6.9.2-1.fc43 fedora 464.0 B
 readline x86_64 8.3-2.fc43 fedora 511.7 KiB
 rpm x86_64 6.0.0-1.fc43 fedora 3.1 MiB
 rpm-build-libs x86_64 6.0.0-1.fc43 fedora 268.4 KiB
 rpm-libs x86_64 6.0.0-1.fc43 fedora 933.7 KiB
 rpm-sequoia x86_64 1.9.0-2.fc43 fedora 2.5 MiB
 rpm-sign-libs x86_64 6.0.0-1.fc43 fedora 39.7 KiB
 rust-srpm-macros noarch 26.4-1.fc43 fedora 4.8 KiB
 setup noarch 2.15.0-26.fc43 fedora 725.0 KiB
 sqlite-libs x86_64 3.50.2-2.fc43 fedora 1.5 MiB
 systemd-libs x86_64 258-1.fc43 fedora 2.3 MiB
 systemd-standalone-sysusers x86_64 258-1.fc43 fedora 293.5 KiB
 tpm2-tss x86_64 4.1.3-8.fc43 fedora 1.6 MiB
 tree-sitter-srpm-macros noarch 0.4.2-1.fc43 fedora 8.3 KiB
 util-linux-core x86_64 2.41.1-16.fc43 fedora 1.5 MiB
 xxhash-libs x86_64 0.8.3-3.fc43 fedora 90.2 KiB
 xz-libs x86_64 1:5.8.1-2.fc43 fedora 217.8 KiB
 zig-srpm-macros noarch 1-5.fc43 fedora 1.1 KiB
 zip x86_64 3.0-44.fc43 fedora 694.5 KiB
 zlib-ng-compat x86_64 2.2.5-2.fc43 fedora 137.6 KiB
 zstd x86_64 1.5.7-2.fc43 fedora 1.7 MiB
Installing groups:
 Buildsystem building group
Transaction Summary:
 Installing: 169 packages
Total size of inbound packages is 59 MiB. Need to download 59 MiB.
After this operation, 198 MiB extra will be used (install 198 MiB, remove 0 B).
[per-package download progress for the 169 inbound packages elided; total below]
--------------------------------------------------------------------------------
[169/169] Total 100% | 175.4 MiB/s | 59.0 MiB | 00m00s
Running transaction
Importing OpenPGP key 0x31645531:
 UserID : "Fedora (43) <fedora-43-primary@fedoraproject.org>"
 Fingerprint: C6E7F081CF80E13146676E88829B606631645531
 From : file:///usr/share/distribution-gpg-keys/fedora/RPM-GPG-KEY-fedora-43-primary
The key was successfully imported.
[  1/171] Verify package files 100% | 789.0 B/s | 169.0 B | 00m00s
[  2/171] Prepare transaction 100% | 4.2 KiB/s | 169.0 B | 00m00s
[per-package install progress for the remaining 169 steps elided; scriptlet output retained below]
>>> Running sysusers scriptlet: setup-0:2.15.0-26.fc43.noarch
>>> Finished sysusers scriptlet: setup-0:2.15.0-26.fc43.noarch
>>> Scriptlet output:
>>> Creating group 'adm' with GID 4.
>>> Creating group 'audio' with GID 63.
>>> Creating group 'cdrom' with GID 11.
>>> Creating group 'clock' with GID 103.
>>> Creating group 'dialout' with GID 18.
>>> Creating group 'disk' with GID 6.
>>> Creating group 'floppy' with GID 19.
>>> Creating group 'ftp' with GID 50.
>>> Creating group 'games' with GID 20.
>>> Creating group 'input' with GID 104.
>>> Creating group 'kmem' with GID 9.
>>> Creating group 'kvm' with GID 36.
>>> Creating group 'lock' with GID 54.
>>> Creating group 'lp' with GID 7.
>>> Creating group 'mail' with GID 12.
>>> Creating group 'man' with GID 15.
>>> Creating group 'mem' with GID 8.
>>> Creating group 'nobody' with GID 65534.
>>> Creating group 'render' with GID 105.
>>> Creating group 'root' with GID 0.
>>> Creating group 'sgx' with GID 106.
>>> Creating group 'sys' with GID 3.
>>> Creating group 'tape' with GID 33.
>>> Creating group 'tty' with GID 5.
>>> Creating group 'users' with GID 100.
>>> Creating group 'utmp' with GID 22.
>>> Creating group 'video' with GID 39.
>>> Creating group 'wheel' with GID 10.
>>> Creating user 'adm' (adm) with UID 3 and GID 4.
>>> Creating group 'bin' with GID 1.
>>> Creating user 'bin' (bin) with UID 1 and GID 1.
>>> Creating group 'daemon' with GID 2.
>>> Creating user 'daemon' (daemon) with UID 2 and GID 2.
>>> Creating user 'ftp' (FTP User) with UID 14 and GID 50.
>>> Creating user 'games' (games) with UID 12 and GID 100.
>>> Creating user 'halt' (halt) with UID 7 and GID 0.
>>> Creating user 'lp' (lp) with UID 4 and GID 7.
>>> Creating user 'mail' (mail) with UID 8 and GID 12.
>>> Creating user 'nobody' (Kernel Overflow User) with UID 65534 and GID 65534.
>>> Creating user 'operator' (operator) with UID 11 and GID 0.
>>> Creating user 'root' (Super User) with UID 0 and GID 0.
>>> Creating user 'shutdown' (shutdown) with UID 6 and GID 0.
>>> Creating user 'sync' (sync) with UID 5 and GID 0.
>>>
>>> [RPM] /etc/hosts created as /etc/hosts.rpmnew
>>> Running sysusers scriptlet: tpm2-tss-0:4.1.3-8.fc43.x86_64
>>> Finished sysusers scriptlet: tpm2-tss-0:4.1.3-8.fc43.x86_64
>>> Scriptlet output:
>>> Creating group 'tss' with GID 59.
>>> Creating user 'tss' (Account used for TPM access) with UID 59 and GID 59.
>>>
Complete!
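The ">>> Running sysusers scriptlet" blocks above come from systemd-standalone-sysusers: a package ships a sysusers.d(5) declaration and the install-time scriptlet creates the listed users and groups. A sketch of what such a declaration looks like for the 'tss' entry the tpm2-tss scriptlet printed (the file path and the '-' defaults are illustrative assumptions, not taken from this log):

  # /usr/lib/sysusers.d/tpm2-tss.conf (hypothetical path)
  # Type  Name  ID  GECOS                          Home  Shell
  g       tss   59
  u       tss   59  "Account used for TPM access"  -     -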
Finish: installing minimal buildroot with dnf5
Start: creating root cache
Finish: creating root cache
Finish: chroot init
INFO: Installed packages:
INFO: add-determinism-0.6.0-2.fc43.x86_64 alternatives-1.33-2.fc43.x86_64 ansible-srpm-macros-1-18.1.fc43.noarch audit-libs-4.1.1-2.fc43.x86_64 bash-5.3.0-2.fc43.x86_64 binutils-2.45-1.fc43.x86_64 build-reproducibility-srpm-macros-0.6.0-2.fc43.noarch bzip2-1.0.8-21.fc43.x86_64 bzip2-libs-1.0.8-21.fc43.x86_64 ca-certificates-2025.2.80_v9.0.304-1.1.fc43.noarch coreutils-9.7-5.fc43.x86_64 coreutils-common-9.7-5.fc43.x86_64 cpio-2.15-6.fc43.x86_64 crypto-policies-20250714-5.gitcd6043a.fc43.noarch curl-8.15.0-2.fc43.x86_64 cyrus-sasl-lib-2.1.28-33.fc43.x86_64 debugedit-5.2-3.fc43.x86_64 diffutils-3.12-3.fc43.x86_64 dwz-0.16-2.fc43.x86_64 ed-1.22.2-1.fc43.x86_64 efi-srpm-macros-6-4.fc43.noarch elfutils-0.193-3.fc43.x86_64 elfutils-debuginfod-client-0.193-3.fc43.x86_64 elfutils-default-yama-scope-0.193-3.fc43.noarch elfutils-libelf-0.193-3.fc43.x86_64 elfutils-libs-0.193-3.fc43.x86_64 fedora-gpg-keys-43-0.4.noarch fedora-release-43-0.22.noarch fedora-release-common-43-0.22.noarch fedora-release-identity-basic-43-0.22.noarch fedora-repos-43-0.4.noarch file-5.46-8.fc43.x86_64 file-libs-5.46-8.fc43.x86_64 filesystem-3.18-50.fc43.x86_64 filesystem-srpm-macros-3.18-50.fc43.noarch findutils-4.10.0-6.fc43.x86_64 fonts-srpm-macros-2.0.5-23.fc43.noarch forge-srpm-macros-0.4.0-3.fc43.noarch fpc-srpm-macros-1.3-15.fc43.noarch gap-srpm-macros-1-1.fc43.noarch gawk-5.3.2-2.fc43.x86_64 gdb-minimal-16.3-6.fc43.x86_64 gdbm-libs-1.23-10.fc43.x86_64 ghc-srpm-macros-1.9.2-3.fc43.noarch glibc-2.42-4.fc43.x86_64 glibc-common-2.42-4.fc43.x86_64 glibc-gconv-extra-2.42-4.fc43.x86_64 glibc-minimal-langpack-2.42-4.fc43.x86_64 gmp-6.3.0-4.fc43.x86_64 gnat-srpm-macros-6-8.fc43.noarch gnupg2-2.4.8-4.fc43.x86_64 gnupg2-dirmngr-2.4.8-4.fc43.x86_64 gnupg2-gpg-agent-2.4.8-4.fc43.x86_64 gnupg2-gpgconf-2.4.8-4.fc43.x86_64 gnupg2-keyboxd-2.4.8-4.fc43.x86_64 gnupg2-verify-2.4.8-4.fc43.x86_64 gnutls-3.8.10-3.fc43.x86_64 go-srpm-macros-3.8.0-1.fc43.noarch gpg-pubkey-c6e7f081cf80e13146676e88829b606631645531-66b6dccf gpgverify-2.2-3.fc43.noarch grep-3.12-2.fc43.x86_64 gzip-1.13-4.fc43.x86_64 ima-evm-utils-libs-1.6.2-6.fc43.x86_64 info-7.2-6.fc43.x86_64 jansson-2.14-3.fc43.x86_64 java-srpm-macros-1-7.fc43.noarch json-c-0.18-7.fc43.x86_64 kernel-srpm-macros-1.0-27.fc43.noarch keyutils-libs-1.6.3-6.fc43.x86_64 krb5-libs-1.21.3-7.fc43.x86_64 libacl-2.3.2-4.fc43.x86_64 libarchive-3.8.1-3.fc43.x86_64 libassuan-2.5.7-4.fc43.x86_64 libattr-2.5.2-6.fc43.x86_64 libblkid-2.41.1-16.fc43.x86_64 libbrotli-1.1.0-10.fc43.x86_64 libcap-2.76-3.fc43.x86_64 libcap-ng-0.8.5-7.fc43.x86_64 libcom_err-1.47.3-2.fc43.x86_64 libcurl-8.15.0-2.fc43.x86_64 libeconf-0.7.9-2.fc43.x86_64 libevent-2.1.12-16.fc43.x86_64 libfdisk-2.41.1-16.fc43.x86_64 libffi-3.5.1-2.fc43.x86_64 libfsverity-1.6-3.fc43.x86_64 libgcc-15.2.1-2.fc43.x86_64 libgcrypt-1.11.1-2.fc43.x86_64 libgomp-15.2.1-2.fc43.x86_64 libgpg-error-1.55-2.fc43.x86_64 libidn2-2.3.8-2.fc43.x86_64 libksba-1.6.7-4.fc43.x86_64 liblastlog2-2.41.1-16.fc43.x86_64 libmount-2.41.1-16.fc43.x86_64 libnghttp2-1.66.0-2.fc43.x86_64 libpkgconf-2.3.0-3.fc43.x86_64 libpsl-0.21.5-6.fc43.x86_64 libselinux-3.9-5.fc43.x86_64 libsemanage-3.9-4.fc43.x86_64 libsepol-3.9-2.fc43.x86_64 libsmartcols-2.41.1-16.fc43.x86_64 libssh-0.11.3-1.fc43.x86_64 libssh-config-0.11.3-1.fc43.noarch libstdc++-15.2.1-2.fc43.x86_64 libtasn1-4.20.0-2.fc43.x86_64 libtool-ltdl-2.5.4-7.fc43.x86_64 libunistring-1.1-10.fc43.x86_64 libusb1-1.0.29-4.fc43.x86_64 libuuid-2.41.1-16.fc43.x86_64 libverto-0.3.2-11.fc43.x86_64 libxcrypt-4.4.38-8.fc43.x86_64 libxml2-2.12.10-4.fc43.x86_64 libzstd-1.5.7-2.fc43.x86_64 lua-libs-5.4.8-2.fc43.x86_64 lua-srpm-macros-1-16.fc43.noarch lz4-libs-1.10.0-3.fc43.x86_64 mpfr-4.2.2-2.fc43.x86_64 ncurses-base-6.5-7.20250614.fc43.noarch ncurses-libs-6.5-7.20250614.fc43.x86_64 nettle-3.10.1-2.fc43.x86_64 npth-1.8-3.fc43.x86_64 ocaml-srpm-macros-11-2.fc43.noarch openblas-srpm-macros-2-20.fc43.noarch openldap-2.6.10-4.fc43.x86_64 openssl-libs-3.5.1-2.fc43.x86_64 p11-kit-0.25.8-1.fc43.x86_64 p11-kit-trust-0.25.8-1.fc43.x86_64 package-notes-srpm-macros-0.5-14.fc43.noarch pam-libs-1.7.1-3.fc43.x86_64 patch-2.8-2.fc43.x86_64 pcre2-10.46-1.fc43.x86_64 pcre2-syntax-10.46-1.fc43.noarch perl-srpm-macros-1-60.fc43.noarch pkgconf-2.3.0-3.fc43.x86_64 pkgconf-m4-2.3.0-3.fc43.noarch pkgconf-pkg-config-2.3.0-3.fc43.x86_64 popt-1.19-9.fc43.x86_64 publicsuffix-list-dafsa-20250616-2.fc43.noarch pyproject-srpm-macros-1.18.4-1.fc43.noarch python-srpm-macros-3.14-5.fc43.noarch qt5-srpm-macros-5.15.17-2.fc43.noarch qt6-srpm-macros-6.9.2-1.fc43.noarch readline-8.3-2.fc43.x86_64 redhat-rpm-config-343-11.fc43.noarch rpm-6.0.0-1.fc43.x86_64 rpm-build-6.0.0-1.fc43.x86_64 rpm-build-libs-6.0.0-1.fc43.x86_64 rpm-libs-6.0.0-1.fc43.x86_64 rpm-sequoia-1.9.0-2.fc43.x86_64 rpm-sign-libs-6.0.0-1.fc43.x86_64 rust-srpm-macros-26.4-1.fc43.noarch sed-4.9-5.fc43.x86_64 setup-2.15.0-26.fc43.noarch shadow-utils-4.18.0-3.fc43.x86_64 sqlite-libs-3.50.2-2.fc43.x86_64 systemd-libs-258-1.fc43.x86_64 systemd-standalone-sysusers-258-1.fc43.x86_64 tar-1.35-6.fc43.x86_64 tpm2-tss-4.1.3-8.fc43.x86_64 tree-sitter-srpm-macros-0.4.2-1.fc43.noarch unzip-6.0-67.fc43.x86_64 util-linux-2.41.1-16.fc43.x86_64 util-linux-core-2.41.1-16.fc43.x86_64 which-2.23-3.fc43.x86_64 xxhash-libs-0.8.3-3.fc43.x86_64 xz-5.8.1-2.fc43.x86_64 xz-libs-5.8.1-2.fc43.x86_64 zig-srpm-macros-1-5.fc43.noarch zip-3.0-44.fc43.x86_64 zlib-ng-compat-2.2.5-2.fc43.x86_64 zstd-1.5.7-2.fc43.x86_64
Start: buildsrpm
Start: rpmbuild -bs
Building target platforms: x86_64
Building for target x86_64
setting SOURCE_DATE_EPOCH=1759363200
Wrote: /builddir/build/SRPMS/ollama-ggml-cuda-0.12.3-1.fc43.src.rpm
Finish: rpmbuild -bs
INFO: chroot_scan: 1 files copied to /var/lib/copr-rpmbuild/results/chroot_scan
INFO: /var/lib/mock/fedora-43-x86_64-1759434727.591343/root/var/log/dnf5.log
INFO: chroot_scan: creating tarball /var/lib/copr-rpmbuild/results/chroot_scan.tar.gz
/bin/tar: Removing leading `/' from member names
Finish: buildsrpm
INFO: Done(/var/lib/copr-rpmbuild/workspace/workdir-ox7e730h/ollama-ggml-cuda/ollama-ggml-cuda.spec) Config(child) 0 minutes 21 seconds
INFO: Results and/or logs in: /var/lib/copr-rpmbuild/results
INFO: Cleaning up build root ('cleanup_on_success=True')
Start: clean chroot
INFO: unmounting tmpfs.
Finish: clean chroot
INFO: Start(/var/lib/copr-rpmbuild/results/ollama-ggml-cuda-0.12.3-1.fc43.src.rpm) Config(fedora-43-x86_64)
Start(bootstrap): chroot init
INFO: mounting tmpfs at /var/lib/mock/fedora-43-x86_64-bootstrap-1759434727.591343/root.
INFO: reusing tmpfs at /var/lib/mock/fedora-43-x86_64-bootstrap-1759434727.591343/root.
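The SRPM written above is the input for the second mock pass that follows. Its header and payload can be inspected directly with standard rpm query flags; a sketch against the copy in the results directory:

  rpm -qpi /var/lib/copr-rpmbuild/results/ollama-ggml-cuda-0.12.3-1.fc43.src.rpm   # name, version, summary
  rpm -qpl /var/lib/copr-rpmbuild/results/ollama-ggml-cuda-0.12.3-1.fc43.src.rpm   # payload: spec file and v0.12.3.tar.gz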
INFO: calling preinit hooks
INFO: enabled root cache
INFO: enabled package manager cache
Start(bootstrap): cleaning package manager metadata
Finish(bootstrap): cleaning package manager metadata
Finish(bootstrap): chroot init
Start: chroot init
INFO: mounting tmpfs at /var/lib/mock/fedora-43-x86_64-1759434727.591343/root.
INFO: calling preinit hooks
INFO: enabled root cache
Start: unpacking root cache
Finish: unpacking root cache
INFO: enabled package manager cache
Start: cleaning package manager metadata
Finish: cleaning package manager metadata
INFO: enabled HW Info plugin
INFO: Buildroot is handled by package management downloaded with a bootstrap image:
  rpm-6.0.0-1.fc43.x86_64
  rpm-sequoia-1.9.0-2.fc43.x86_64
  dnf5-5.2.17.0-2.fc43.x86_64
  dnf5-plugins-5.2.17.0-2.fc43.x86_64
Finish: chroot init
Start: build phase for ollama-ggml-cuda-0.12.3-1.fc43.src.rpm
Start: build setup for ollama-ggml-cuda-0.12.3-1.fc43.src.rpm
Building target platforms: x86_64
Building for target x86_64
setting SOURCE_DATE_EPOCH=1759363200
Wrote: /builddir/build/SRPMS/ollama-ggml-cuda-0.12.3-1.fc43.src.rpm
Updating and loading repositories:
 Additional repo https_developer_downlo 100% | 20.4 KiB/s | 3.9 KiB | 00m00s
 Additional repo https_developer_downlo 100% | 20.4 KiB/s | 3.9 KiB | 00m00s
 Copr repository 100% | 7.8 KiB/s | 1.5 KiB | 00m00s
 fedora 100% | 31.8 KiB/s | 10.3 KiB | 00m00s
 updates 100% | 117.4 KiB/s | 30.2 KiB | 00m00s
Repositories loaded.
Package  Arch  Version  Repository  Size
Installing:
 cmake  x86_64  3.31.6-4.fc43  fedora  34.5 MiB
 cuda-compiler-12-9  x86_64  12.9.1-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  0.0 B
 cuda-compiler-13-0  x86_64  13.0.1-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  0.0 B
 cuda-libraries-devel-12-9  x86_64  12.9.1-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  0.0 B
 cuda-libraries-devel-13-0  x86_64  13.0.1-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  0.0 B
 cuda-nvml-devel-12-9  x86_64  12.9.79-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  1.4 MiB
 cuda-nvml-devel-13-0  x86_64  13.0.87-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  1.4 MiB
 gcc-c++  x86_64  15.2.1-2.fc43  fedora  41.4 MiB
 gcc14  x86_64  14.3.1-1.fc43  fedora  117.6 MiB
 gcc14-c++  x86_64  14.3.1-1.fc43  fedora  124.1 MiB
Installing dependencies:
 annobin-docs  noarch  12.99-1.fc43  fedora  98.9 KiB
 annobin-plugin-gcc  x86_64  12.99-1.fc43  fedora  1.0 MiB
 cmake-data  noarch  3.31.6-4.fc43  fedora  8.5 MiB
 cmake-filesystem  x86_64  3.31.6-4.fc43  fedora  0.0 B
 cmake-rpm-macros  noarch  3.31.6-4.fc43  fedora  7.7 KiB
 cpp  x86_64  15.2.1-2.fc43  fedora  37.9 MiB
 cuda-cccl-12-9  x86_64  12.9.27-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  12.7 MiB
 cuda-cccl-13-0  x86_64  13.0.85-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  13.2 MiB
 cuda-crt-12-9  x86_64  12.9.86-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  928.8 KiB
 cuda-crt-13-0  x86_64  13.0.88-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  936.8 KiB
 cuda-cudart-12-9  x86_64  12.9.79-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  785.8 KiB
 cuda-cudart-13-0  x86_64  13.0.88-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  754.1 KiB
 cuda-cudart-devel-12-9  x86_64  12.9.79-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  8.5 MiB
 cuda-cudart-devel-13-0  x86_64  13.0.88-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  6.2 MiB
 cuda-culibos-devel-13-0  x86_64  13.0.85-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  96.4 KiB
 cuda-cuobjdump-12-9  x86_64  12.9.82-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  665.7 KiB
 cuda-cuobjdump-13-0  x86_64  13.0.85-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  750.4 KiB
 cuda-cuxxfilt-12-9  x86_64  12.9.82-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  1.0 MiB
 cuda-cuxxfilt-13-0  x86_64  13.0.85-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  1.0 MiB
 cuda-driver-devel-12-9  x86_64  12.9.79-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  131.0 KiB
 cuda-driver-devel-13-0  x86_64  13.0.88-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  135.3 KiB
 cuda-nvcc-12-9  x86_64  12.9.86-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  317.8 MiB
 cuda-nvcc-13-0  x86_64  13.0.88-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  111.0 MiB
 cuda-nvprune-12-9  x86_64  12.9.82-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  181.0 KiB
 cuda-nvprune-13-0  x86_64  13.0.85-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  181.3 KiB
 cuda-nvrtc-12-9  x86_64  12.9.86-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  216.9 MiB
 cuda-nvrtc-13-0  x86_64  13.0.88-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  217.4 MiB
 cuda-nvrtc-devel-12-9  x86_64  12.9.86-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  248.0 MiB
 cuda-nvrtc-devel-13-0  x86_64  13.0.88-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  244.5 MiB
 cuda-nvvm-12-9  x86_64  12.9.86-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  132.6 MiB
 cuda-opencl-12-9  x86_64  12.9.19-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  91.7 KiB
 cuda-opencl-13-0  x86_64  13.0.85-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  96.5 KiB
 cuda-opencl-devel-12-9  x86_64  12.9.19-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  741.1 KiB
 cuda-opencl-devel-13-0  x86_64  13.0.85-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  747.9 KiB
 cuda-profiler-api-12-9  x86_64  12.9.79-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  73.4 KiB
 cuda-profiler-api-13-0  x86_64  13.0.85-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  77.6 KiB
 cuda-sandbox-devel-12-9  x86_64  12.9.19-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  146.3 KiB
 cuda-sandbox-devel-13-0  x86_64  13.0.85-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  149.4 KiB
 cuda-toolkit-12-9-config-common  noarch  12.9.79-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  0.0 B
 cuda-toolkit-12-config-common  noarch  12.9.79-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  44.0 B
 cuda-toolkit-13-0-config-common  noarch  13.0.88-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  0.0 B
 cuda-toolkit-13-config-common  noarch  13.0.88-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  44.0 B
 cuda-toolkit-config-common  noarch  13.0.88-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  41.0 B
 emacs-filesystem  noarch  1:30.0-5.fc43  fedora  0.0 B
 expat  x86_64  2.7.2-1.fc43  fedora  298.6 KiB
 gcc  x86_64  15.2.1-2.fc43  fedora  111.9 MiB
 gcc-plugin-annobin  x86_64  15.2.1-2.fc43  fedora  57.2 KiB
 glibc-devel  x86_64  2.42-4.fc43  fedora  2.3 MiB
 jsoncpp  x86_64  1.9.6-2.fc43  fedora  257.6 KiB
 kernel-headers  x86_64  6.17.0-63.fc43  fedora  6.7 MiB
 libcublas-12-9  x86_64  12.9.1.4-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  815.6 MiB
 libcublas-13-0  x86_64  13.0.2.14-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  567.2 MiB
 libcublas-devel-12-9  x86_64  12.9.1.4-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  1.2 GiB
 libcublas-devel-13-0  x86_64  13.0.2.14-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  961.6 MiB
 libcufft-12-9  x86_64  11.4.1.4-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  277.2 MiB
 libcufft-13-0  x86_64  12.0.0.61-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  274.3 MiB
 libcufft-devel-12-9  x86_64  11.4.1.4-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  567.3 MiB
 libcufft-devel-13-0  x86_64  12.0.0.61-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  280.5 MiB
 libcufile-12-9  x86_64  1.14.1.1-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  3.2 MiB
 libcufile-13-0  x86_64  1.15.1.6-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  3.2 MiB
 libcufile-devel-12-9  x86_64  1.14.1.1-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  27.9 MiB
 libcufile-devel-13-0  x86_64  1.15.1.6-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  27.9 MiB
 libcurand-12-9  x86_64  10.3.10.19-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  159.3 MiB
 libcurand-13-0  x86_64  10.4.0.35-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  126.6 MiB
 libcurand-devel-12-9  x86_64  10.3.10.19-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  161.3 MiB
 libcurand-devel-13-0  x86_64  10.4.0.35-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  129.0 MiB
 libcusolver-12-9  x86_64  11.7.5.82-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  470.6 MiB
 libcusolver-13-0  x86_64  12.0.4.66-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  233.8 MiB
 libcusolver-devel-12-9  x86_64  11.7.5.82-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  332.5 MiB
 libcusolver-devel-13-0  x86_64  12.0.4.66-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  180.9 MiB
 libcusparse-12-9  x86_64  12.5.10.65-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  463.0 MiB
 libcusparse-13-0  x86_64  12.6.3.3-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  155.1 MiB
 libcusparse-devel-12-9  x86_64  12.5.10.65-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  960.3 MiB
 libcusparse-devel-13-0  x86_64  12.6.3.3-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  348.7 MiB
 libmpc  x86_64  1.3.1-8.fc43  fedora  160.6 KiB
 libnpp-12-9  x86_64  12.4.1.87-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  393.0 MiB
 libnpp-13-0  x86_64  13.0.1.2-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  157.3 MiB
 libnpp-devel-12-9  x86_64  12.4.1.87-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  406.2 MiB
 libnpp-devel-13-0  x86_64  13.0.1.2-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  184.5 MiB
 libnvfatbin-12-9  x86_64  12.9.82-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  2.4 MiB
 libnvfatbin-13-0  x86_64  13.0.85-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  2.4 MiB
 libnvfatbin-devel-12-9  x86_64  12.9.82-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  2.3 MiB
 libnvfatbin-devel-13-0  x86_64  13.0.85-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  2.3 MiB
 libnvjitlink-12-9  x86_64  12.9.86-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  91.6 MiB
 libnvjitlink-13-0  x86_64  13.0.88-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  94.3 MiB
 libnvjitlink-devel-12-9  x86_64  12.9.86-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  127.6 MiB
 libnvjitlink-devel-13-0  x86_64  13.0.88-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  130.0 MiB
 libnvjpeg-12-9  x86_64  12.4.0.76-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  9.0 MiB
 libnvjpeg-13-0  x86_64  13.0.1.86-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  5.7 MiB
 libnvjpeg-devel-12-9  x86_64  12.4.0.76-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  9.4 MiB
 libnvjpeg-devel-13-0  x86_64  13.0.1.86-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  6.4 MiB
 libnvptxcompiler-13-0  x86_64  13.0.88-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  85.4 MiB
 libnvvm-13-0  x86_64  13.0.88-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  133.6 MiB
 libstdc++-devel  x86_64  15.2.1-2.fc43  fedora  37.3 MiB
 libuv  x86_64  1:1.51.0-2.fc43  fedora  570.2 KiB
 libxcrypt-devel  x86_64  4.4.38-8.fc43  fedora  30.8 KiB
 make  x86_64  1:4.4.1-11.fc43  fedora  1.8 MiB
 mpdecimal  x86_64  4.0.1-2.fc43  fedora  217.2 KiB
 python-pip-wheel  noarch  25.1.1-16.fc43  fedora  1.2 MiB
 python3  x86_64  3.14.0~rc3-1.fc43  fedora  28.9 KiB
 python3-libs  x86_64  3.14.0~rc3-1.fc43  fedora  43.0 MiB
 rhash  x86_64  1.4.5-3.fc43  fedora  351.1 KiB
 tzdata  noarch  2025b-3.fc43  fedora  1.6 MiB
 vim-filesystem  noarch  2:9.1.1775-1.fc43  fedora  40.0 B
Transaction Summary:
 Installing: 114 packages
Total size of inbound packages is 7 GiB. Need to download 7 GiB.
After this operation, 12 GiB extra will be used (install 12 GiB, remove 0 B).
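Note that this single transaction pulls in two complete, parallel CUDA stacks — 12.9 from the fedora41 repo and 13.0 from the fedora42 repo — plus gcc14 alongside the system gcc 15, presumably because nvcc 12.x does not accept GCC 15 as a host compiler. A quick, hedged way to confirm the resulting layout from outside the build (illustrative sketch; it assumes the mock config path printed at the top of this log):

    # Spot-check the buildroot: both toolkits should appear under /usr/local,
    # with both nvcc packages and both host compilers installed.
    mock -r /var/lib/copr-rpmbuild/results/configs/child.cfg \
         --chroot 'ls /usr/local; rpm -q cuda-nvcc-12-9 cuda-nvcc-13-0 gcc gcc14'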
[  1/114] cmake-0:3.31.6-4.fc43.x86_64 100% | 210.9 MiB/s | 12.2 MiB | 00m00s
[  2/114] gcc14-c++-0:14.3.1-1.fc43.x86 100% | 137.0 MiB/s | 25.9 MiB | 00m00s
[  3/114] cuda-compiler-12-9-0:12.9.1-1 100% | 37.8 KiB/s | 7.4 KiB | 00m00s
[  4/114] gcc14-0:14.3.1-1.fc43.x86_64 100% | 149.2 MiB/s | 43.9 MiB | 00m00s
[  5/114] cuda-compiler-13-0-0:13.0.1-1 100% | 71.7 KiB/s | 7.5 KiB | 00m00s
[  6/114] cuda-libraries-devel-12-9-0:1 100% | 116.2 KiB/s | 7.9 KiB | 00m00s
[  7/114] cuda-libraries-devel-13-0-0:1 100% | 122.1 KiB/s | 7.9 KiB | 00m00s
[  8/114] gcc-c++-0:15.2.1-2.fc43.x86_6 100% | 200.7 MiB/s | 15.3 MiB | 00m00s
[  9/114] libmpc-0:1.3.1-8.fc43.x86_64 100% | 22.9 MiB/s | 70.4 KiB | 00m00s
[ 10/114] make-1:4.4.1-11.fc43.x86_64 100% | 114.3 MiB/s | 585.2 KiB | 00m00s
[ 11/114] cmake-data-0:3.31.6-4.fc43.no 100% | 117.5 MiB/s | 2.5 MiB | 00m00s
[ 12/114] cmake-filesystem-0:3.31.6-4.f 100% | 5.0 MiB/s | 15.5 KiB | 00m00s
[ 13/114] expat-0:2.7.2-1.fc43.x86_64 100% | 23.2 MiB/s | 118.9 KiB | 00m00s
[ 14/114] jsoncpp-0:1.9.6-2.fc43.x86_64 100% | 16.4 MiB/s | 101.1 KiB | 00m00s
[ 15/114] libuv-1:1.51.0-2.fc43.x86_64 100% | 52.0 MiB/s | 266.1 KiB | 00m00s
[ 16/114] rhash-0:1.4.5-3.fc43.x86_64 100% | 17.6 MiB/s | 197.9 KiB | 00m00s
[ 17/114] cuda-nvml-devel-12-9-0:12.9.7 100% | 953.4 KiB/s | 201.2 KiB | 00m00s
[ 18/114] cuda-nvml-devel-13-0-0:13.0.8 100% | 915.8 KiB/s | 218.9 KiB | 00m00s
[ 19/114] cuda-cuobjdump-12-9-0:12.9.82 100% | 1.8 MiB/s | 277.9 KiB | 00m00s
[ 20/114] cuda-cuxxfilt-12-9-0:12.9.82- 100% | 1.4 MiB/s | 282.8 KiB | 00m00s
[ 21/114] cuda-nvprune-12-9-0:12.9.82-1 100% | 296.9 KiB/s | 76.0 KiB | 00m00s
[ 22/114] cuda-crt-13-0-0:13.0.88-1.x86 100% | 260.6 KiB/s | 120.9 KiB | 00m00s
[ 23/114] cuda-cuobjdump-13-0-0:13.0.85 100% | 753.0 KiB/s | 309.5 KiB | 00m00s
[ 24/114] cuda-cuxxfilt-13-0-0:13.0.85- 100% | 1.5 MiB/s | 283.6 KiB | 00m00s
[ 25/114] cuda-nvprune-13-0-0:13.0.85-1 100% | 503.8 KiB/s | 76.6 KiB | 00m00s
[ 26/114] cuda-nvcc-13-0-0:13.0.88-1.x8 100% | 57.7 MiB/s | 35.3 MiB | 00m01s
[ 27/114] libnvptxcompiler-13-0-0:13.0. 100% | 44.9 MiB/s | 21.3 MiB | 00m00s
[ 28/114] cuda-cccl-12-9-0:12.9.27-1.x8 100% | 10.4 MiB/s | 1.7 MiB | 00m00s
[ 29/114] cuda-nvcc-12-9-0:12.9.86-1.x8 100% | 59.3 MiB/s | 111.3 MiB | 00m02s
[ 30/114] cuda-driver-devel-12-9-0:12.9 100% | 683.7 KiB/s | 43.1 KiB | 00m00s
[ 31/114] cuda-cudart-devel-12-9-0:12.9 100% | 7.5 MiB/s | 3.0 MiB | 00m00s
[ 32/114] libnvvm-13-0-0:13.0.88-1.x86_ 100% | 77.3 MiB/s | 58.3 MiB | 00m01s
[ 33/114] cuda-opencl-devel-12-9-0:12.9 100% | 823.7 KiB/s | 119.4 KiB | 00m00s
[ 34/114] cuda-profiler-api-12-9-0:12.9 100% | 391.5 KiB/s | 26.2 KiB | 00m00s
[ 35/114] cuda-sandbox-devel-12-9-0:12. 100% | 631.9 KiB/s | 44.2 KiB | 00m00s
[ 36/114] cuda-nvrtc-devel-12-9-0:12.9. 100% | 60.7 MiB/s | 74.2 MiB | 00m01s
[ 37/114] libcufile-devel-12-9-0:1.14.1 100% | 25.4 MiB/s | 5.2 MiB | 00m00s
[ 38/114] libcurand-devel-12-9-0:10.3.1 100% | 50.1 MiB/s | 64.2 MiB | 00m01s
[ 39/114] libcusolver-devel-12-9-0:11.7 100% | 56.7 MiB/s | 213.1 MiB | 00m04s
[ 40/114] libcufft-devel-12-9-0:11.4.1. 100% | 57.7 MiB/s | 385.6 MiB | 00m07s
[ 41/114] libcublas-devel-12-9-0:12.9.1 100% | 59.5 MiB/s | 630.3 MiB | 00m11s
[ 42/114] libnvfatbin-devel-12-9-0:12.9 100% | 2.8 MiB/s | 863.8 KiB | 00m00s
[ 43/114] libnvjitlink-devel-12-9-0:12. 100% | 54.4 MiB/s | 36.1 MiB | 00m01s
[ 44/114] libnpp-devel-12-9-0:12.4.1.87 100% | 52.4 MiB/s | 268.0 MiB | 00m05s
[ 45/114] libnvjpeg-devel-12-9-0:12.4.0 100% | 12.9 MiB/s | 4.9 MiB | 00m00s
[ 46/114] cuda-cccl-13-0-0:13.0.85-1.x8 100% | 10.6 MiB/s | 1.7 MiB | 00m00s
[ 47/114] cuda-culibos-devel-13-0-0:13. 100% | 478.0 KiB/s | 32.5 KiB | 00m00s
[ 48/114] cuda-cudart-devel-13-0-0:13.0 100% | 11.8 MiB/s | 1.9 MiB | 00m00s
[ 49/114] cuda-driver-devel-13-0-0:13.0 100% | 475.8 KiB/s | 44.3 KiB | 00m00s
[ 50/114] cuda-opencl-devel-13-0-0:13.0 100% | 875.2 KiB/s | 120.8 KiB | 00m00s
[ 51/114] cuda-profiler-api-13-0-0:13.0 100% | 360.8 KiB/s | 27.1 KiB | 00m00s
[ 52/114] cuda-sandbox-devel-13-0-0:13. 100% | 666.7 KiB/s | 45.3 KiB | 00m00s
[ 53/114] cuda-nvrtc-devel-13-0-0:13.0. 100% | 50.7 MiB/s | 73.7 MiB | 00m01s
[ 54/114] libcufft-devel-13-0-0:12.0.0. 100% | 57.4 MiB/s | 205.4 MiB | 00m04s
[ 55/114] libcufile-devel-13-0-0:1.15.1 100% | 14.8 MiB/s | 5.2 MiB | 00m00s
[ 56/114] libcusparse-devel-12-9-0:12.5 100% | 58.8 MiB/s | 710.9 MiB | 00m12s
[ 57/114] libcurand-devel-13-0-0:10.4.0 100% | 34.5 MiB/s | 56.0 MiB | 00m02s
[ 58/114] libcusolver-devel-13-0-0:12.0 100% | 55.0 MiB/s | 124.4 MiB | 00m02s
[ 59/114] libcublas-devel-13-0-0:13.0.2 100% | 54.3 MiB/s | 470.7 MiB | 00m09s
[ 60/114] libnvfatbin-devel-13-0-0:13.0 100% | 4.6 MiB/s | 877.4 KiB | 00m00s
[ 61/114] libnvjitlink-devel-13-0-0:13. 100% | 46.8 MiB/s | 36.7 MiB | 00m01s
[ 62/114] libnvjpeg-devel-13-0-0:13.0.1 100% | 15.8 MiB/s | 3.4 MiB | 00m00s
[ 63/114] gcc-0:15.2.1-2.fc43.x86_64 100% | 252.9 MiB/s | 39.7 MiB | 00m00s
[ 64/114] emacs-filesystem-1:30.0-5.fc4 100% | 2.4 MiB/s | 7.5 KiB | 00m00s
[ 65/114] vim-filesystem-2:9.1.1775-1.f 100% | 3.8 MiB/s | 15.4 KiB | 00m00s
[ 66/114] cuda-crt-12-9-0:12.9.86-1.x86 100% | 814.1 KiB/s | 119.7 KiB | 00m00s
[ 67/114] libnpp-devel-13-0-0:13.0.1.2- 100% | 58.7 MiB/s | 125.6 MiB | 00m02s
[ 68/114] cuda-cudart-12-9-0:12.9.79-1. 100% | 1.7 MiB/s | 236.8 KiB | 00m00s
[ 69/114] libcusparse-devel-13-0-0:12.6 100% | 61.0 MiB/s | 286.7 MiB | 00m05s
[ 70/114] cuda-nvvm-12-9-0:12.9.86-1.x8 100% | 45.7 MiB/s | 57.6 MiB | 00m01s
[ 71/114] cuda-opencl-12-9-0:12.9.19-1. 100% | 503.6 KiB/s | 34.2 KiB | 00m00s
[ 72/114] cuda-nvrtc-12-9-0:12.9.86-1.x 100% | 52.5 MiB/s | 84.8 MiB | 00m02s
[ 73/114] libcufile-12-9-0:1.14.1.1-1.x 100% | 7.5 MiB/s | 1.2 MiB | 00m00s
[ 74/114] libcurand-12-9-0:10.3.10.19-1 100% | 41.9 MiB/s | 63.9 MiB | 00m02s
[ 75/114] libcufft-12-9-0:11.4.1.4-1.x8 100% | 53.0 MiB/s | 191.7 MiB | 00m04s
[ 76/114] libcusolver-12-9-0:11.7.5.82- 100% | 59.2 MiB/s | 324.9 MiB | 00m05s
[ 77/114] libcublas-12-9-0:12.9.1.4-1.x 100% | 57.1 MiB/s | 555.4 MiB | 00m10s
[ 78/114] libcusparse-12-9-0:12.5.10.65 100% | 54.8 MiB/s | 351.7 MiB | 00m06s
[ 79/114] libnvfatbin-12-9-0:12.9.82-1. 100% | 886.9 KiB/s | 940.1 KiB | 00m01s
[ 80/114] libnvjpeg-12-9-0:12.4.0.76-1. 100% | 3.3 MiB/s | 5.1 MiB | 00m02s
[ 81/114] cuda-cudart-13-0-0:13.0.88-1. 100% | 929.7 KiB/s | 223.1 KiB | 00m00s
[ 82/114] libnvjitlink-12-9-0:12.9.86-1 100% | 14.1 MiB/s | 37.6 MiB | 00m03s
[ 83/114] cuda-opencl-13-0-0:13.0.85-1. 100% | 504.0 KiB/s | 35.3 KiB | 00m00s
[ 84/114] cuda-nvrtc-13-0-0:13.0.88-1.x 100% | 62.5 MiB/s | 85.4 MiB | 00m01s
[ 85/114] libnpp-12-9-0:12.4.1.87-1.x86 100% | 38.6 MiB/s | 271.1 MiB | 00m07s
[ 86/114] libcufile-13-0-0:1.15.1.6-1.x 100% | 692.1 KiB/s | 1.2 MiB | 00m02s
[ 87/114] libcurand-13-0-0:10.4.0.35-1. 100% | 41.2 MiB/s | 55.7 MiB | 00m01s
[ 88/114] libcufft-13-0-0:12.0.0.61-1.x 100% | 40.5 MiB/s | 204.4 MiB | 00m05s
[ 89/114] libcublas-13-0-0:13.0.2.14-1. 100% | 44.2 MiB/s | 401.1 MiB | 00m09s
[ 90/114] libcusolver-13-0-0:12.0.4.66- 100% | 39.9 MiB/s | 191.4 MiB | 00m05s
[ 91/114] libcusparse-13-0-0:12.6.3.3-1 100% | 36.5 MiB/s | 139.2 MiB | 00m04s
[ 92/114] libnvfatbin-13-0-0:13.0.85-1. 100% | 4.3 MiB/s | 950.0 KiB | 00m00s
[ 93/114] libnvjpeg-13-0-0:13.0.1.86-1. 100% | 16.3 MiB/s | 3.5 MiB | 00m00s
[ 94/114] cpp-0:15.2.1-2.fc43.x86_64 100% | 263.8 MiB/s | 12.9 MiB | 00m00s
[ 95/114] libstdc++-devel-0:15.2.1-2.fc 100% | 120.3 MiB/s | 5.3 MiB | 00m00s
[ 96/114] glibc-devel-0:2.42-4.fc43.x86 100% | 110.5 MiB/s | 565.9 KiB | 00m00s
[ 97/114] libxcrypt-devel-0:4.4.38-8.fc 100% | 9.5 MiB/s | 29.2 KiB | 00m00s
[ 98/114] cuda-toolkit-config-common-0: 100% | 102.0 KiB/s | 8.0 KiB | 00m00s
[ 99/114] cuda-toolkit-13-0-config-comm 100% | 103.6 KiB/s | 7.8 KiB | 00m00s
[100/114] libnvjitlink-13-0-0:13.0.88-1 100% | 60.9 MiB/s | 38.5 MiB | 00m01s
[101/114] cuda-toolkit-13-config-common 100% | 110.7 KiB/s | 8.0 KiB | 00m00s
[102/114] cuda-toolkit-12-9-config-comm 100% | 114.2 KiB/s | 7.8 KiB | 00m00s
[103/114] kernel-headers-0:6.17.0-63.fc 100% | 188.6 MiB/s | 1.7 MiB | 00m00s
[104/114] annobin-plugin-gcc-0:12.99-1. 100% | 138.9 MiB/s | 996.0 KiB | 00m00s
[105/114] gcc-plugin-annobin-0:15.2.1-2 100% | 18.6 MiB/s | 57.1 KiB | 00m00s
[106/114] annobin-docs-0:12.99-1.fc43.n 100% | 17.5 MiB/s | 89.5 KiB | 00m00s
[107/114] cmake-rpm-macros-0:3.31.6-4.f 100% | 7.2 MiB/s | 14.8 KiB | 00m00s
[108/114] python3-0:3.14.0~rc3-1.fc43.x 100% | 13.5 MiB/s | 27.6 KiB | 00m00s
[109/114] python3-libs-0:3.14.0~rc3-1.f 100% | 233.9 MiB/s | 9.8 MiB | 00m00s
[110/114] mpdecimal-0:4.0.1-2.fc43.x86_ 100% | 23.7 MiB/s | 97.1 KiB | 00m00s
[111/114] python-pip-wheel-0:25.1.1-16. 100% | 200.8 MiB/s | 1.2 MiB | 00m00s
[112/114] tzdata-0:2025b-3.fc43.noarch 100% | 77.5 MiB/s | 713.9 KiB | 00m00s
[113/114] cuda-toolkit-12-config-common 100% | 45.8 KiB/s | 8.0 KiB | 00m00s
[114/114] libnpp-13-0-0:13.0.1.2-1.x86_ 100% | 63.6 MiB/s | 127.8 MiB | 00m02s
--------------------------------------------------------------------------------
[114/114] Total 100% | 146.3 MiB/s | 7.2 GiB | 00m50s
Running transaction
[  1/116] Verify package files 100% | 2.0 B/s | 114.0 B | 00m52s
[  2/116] Prepare transaction 100% | 1.7 KiB/s | 114.0 B | 00m00s
[  3/116] Installing cuda-toolkit-confi 100% | 304.7 KiB/s | 312.0 B | 00m00s
[  4/116] Installing cuda-toolkit-12-co 100% | 0.0 B/s | 316.0 B | 00m00s
[  5/116] Installing cuda-toolkit-12-9- 100% | 0.0 B/s | 124.0 B | 00m00s
[  6/116] Installing cuda-toolkit-13-co 100% | 0.0 B/s | 316.0 B | 00m00s
[  7/116] Installing cuda-toolkit-13-0- 100% | 0.0 B/s | 124.0 B | 00m00s
[  8/116] Installing cuda-culibos-devel 100% | 0.0 B/s | 97.0 KiB | 00m00s
[  9/116] Installing libmpc-0:1.3.1-8.f 100% | 79.1 MiB/s | 162.1 KiB | 00m00s
[ 10/116] Installing make-1:4.4.1-11.fc 100% | 72.0 MiB/s | 1.8 MiB | 00m00s
[ 11/116] Installing libstdc++-devel-0: 100% | 451.6 MiB/s | 37.5 MiB | 00m00s
[ 12/116] Installing cuda-cccl-13-0-0:1 100% | 212.3 MiB/s | 13.6 MiB | 00m00s
[ 13/116] Installing cuda-cccl-12-9-0:1 100% | 111.6 MiB/s | 13.1 MiB | 00m00s
[ 14/116] Installing libnvvm-13-0-0:13. 100% | 65.1 MiB/s | 133.6 MiB | 00m02s
[ 15/116] Installing libnvptxcompiler-1 100% | 76.0 MiB/s | 85.4 MiB | 00m01s
[ 16/116] Installing cuda-crt-13-0-0:13 100% | 153.3 MiB/s | 942.2 KiB | 00m00s
[ 17/116] Installing expat-0:2.7.2-1.fc 100% | 19.6 MiB/s | 300.7 KiB | 00m00s
[ 18/116] Installing cmake-filesystem-0 100% | 7.4 MiB/s | 7.6 KiB | 00m00s
[ 19/116] Installing cpp-0:15.2.1-2.fc4 100% | 341.9 MiB/s | 38.0 MiB | 00m00s
[ 20/116] Installing cuda-sandbox-devel 100% | 148.2 MiB/s | 151.7 KiB | 00m00s
[ 21/116] Installing cuda-cudart-13-0-0 100% | 82.0 MiB/s | 755.6 KiB | 00m00s
[ 22/116] Installing cuda-cudart-devel- 100% | 298.2 MiB/s | 6.3 MiB | 00m00s
[ 23/116] Installing cuda-opencl-13-0-0 100% | 19.2 MiB/s | 98.1 KiB | 00m00s
[ 24/116] Installing cuda-opencl-devel- 100% | 244.6 MiB/s | 751.3 KiB | 00m00s
[ 25/116] Installing libcublas-13-0-0:1 100% | 132.9 MiB/s | 567.2 MiB | 00m04s
[ 26/116] Installing libcublas-devel-13 100% | 60.1 MiB/s | 961.6 MiB | 00m16s
[ 27/116] Installing libcufft-13-0-0:12 100% | 187.6 MiB/s | 274.3 MiB | 00m01s
[ 28/116] Installing libcufft-devel-13- 100% | 42.5 MiB/s | 280.5 MiB | 00m07s
[ 29/116] Installing libcufile-13-0-0:1 100% | 169.0 MiB/s | 3.2 MiB | 00m00s
[ 30/116] Installing libcufile-devel-13 100% | 172.3 MiB/s | 27.9 MiB | 00m00s
[ 31/116] Installing libcurand-13-0-0:1 100% | 368.1 MiB/s | 126.6 MiB | 00m00s
[ 32/116] Installing libcurand-devel-13 100% | 61.5 MiB/s | 129.0 MiB | 00m02s
[ 33/116] Installing libcusolver-13-0-0 100% | 190.6 MiB/s | 233.8 MiB | 00m01s
[ 34/116] Installing libcusolver-devel- 100% | 42.7 MiB/s | 180.9 MiB | 00m04s
[ 35/116] Installing libcusparse-13-0-0 100% | 281.5 MiB/s | 155.1 MiB | 00m01s
[ 36/116] Installing libcusparse-devel- 100% | 41.3 MiB/s | 348.7 MiB | 00m08s
[ 37/116] Installing libnpp-13-0-0:13.0 100% | 315.3 MiB/s | 157.4 MiB | 00m00s
[ 38/116] Installing libnpp-devel-13-0- 100% | 52.4 MiB/s | 184.5 MiB | 00m04s
[ 39/116] Installing libnvfatbin-13-0-0 100% | 161.3 MiB/s | 2.4 MiB | 00m00s
[ 40/116] Installing libnvfatbin-devel- 100% | 101.8 MiB/s | 2.3 MiB | 00m00s
[ 41/116] Installing libnvjitlink-13-0- 100% | 250.1 MiB/s | 94.3 MiB | 00m00s
[ 42/116] Installing libnvjitlink-devel 100% | 84.0 MiB/s | 130.0 MiB | 00m02s
[ 43/116] Installing libnvjpeg-13-0-0:1 100% | 217.9 MiB/s | 5.7 MiB | 00m00s
[ 44/116] Installing libnvjpeg-devel-13 100% | 22.5 MiB/s | 6.4 MiB | 00m00s
[ 45/116] Installing cuda-sandbox-devel 100% | 72.6 MiB/s | 148.6 KiB | 00m00s
[ 46/116] Installing cuda-cudart-12-9-0 100% | 59.1 MiB/s | 787.3 KiB | 00m00s
[ 47/116] Installing cuda-cudart-devel- 100% | 94.2 MiB/s | 8.5 MiB | 00m00s
[ 48/116] Installing cuda-opencl-12-9-0 100% | 15.2 MiB/s | 93.4 KiB | 00m00s
[ 49/116] Installing cuda-opencl-devel- 100% | 181.8 MiB/s | 744.4 KiB | 00m00s
[ 50/116] Installing libcublas-12-9-0:1 100% | 48.2 MiB/s | 815.6 MiB | 00m17s
[ 51/116] Installing libcublas-devel-12 100% | 58.2 MiB/s | 1.2 GiB | 00m21s
[ 52/116] Installing libcufft-12-9-0:11 100% | 45.0 MiB/s | 277.2 MiB | 00m06s
[ 53/116] Installing libcufft-devel-12- 100% | 47.9 MiB/s | 567.3 MiB | 00m12s
[ 54/116] Installing libcufile-12-9-0:1 100% | 15.8 MiB/s | 3.2 MiB | 00m00s
[ 55/116] Installing libcufile-devel-12 100% | 120.3 MiB/s | 27.9 MiB | 00m00s
[ 56/116] Installing libcurand-12-9-0:1 100% | 83.7 MiB/s | 159.3 MiB | 00m02s
[ 57/116] Installing libcurand-devel-12 100% | 79.4 MiB/s | 161.3 MiB | 00m02s
[ 58/116] Installing libcusolver-12-9-0 100% | 43.9 MiB/s | 470.6 MiB | 00m11s
[ 59/116] Installing libcusolver-devel- 100% | 49.4 MiB/s | 332.5 MiB | 00m07s
[ 60/116] Installing libcusparse-12-9-0 100% | 42.4 MiB/s | 463.0 MiB | 00m11s
[ 61/116] Installing libcusparse-devel- 100% | 42.9 MiB/s | 960.3 MiB | 00m22s
[ 62/116] Installing libnpp-12-9-0:12.4 100% | 45.3 MiB/s | 393.0 MiB | 00m09s
[ 63/116] Installing libnpp-devel-12-9- 100% | 46.0 MiB/s | 406.2 MiB | 00m09s
[ 64/116] Installing libnvfatbin-12-9-0 100% | 70.5 MiB/s | 2.4 MiB | 00m00s
[ 65/116] Installing libnvfatbin-devel- 100% | 96.2 MiB/s | 2.3 MiB | 00m00s
[ 66/116] Installing libnvjitlink-12-9- 100% | 76.7 MiB/s | 91.6 MiB | 00m01s
[ 67/116] Installing libnvjitlink-devel 100% | 102.6 MiB/s | 127.6 MiB | 00m01s
[ 68/116] Installing libnvjpeg-12-9-0:1 100% | 59.5 MiB/s | 9.0 MiB | 00m00s
[ 69/116] Installing libnvjpeg-devel-12 100% | 44.7 MiB/s | 9.4 MiB | 00m00s
[ 70/116] Installing tzdata-0:2025b-3.f 100% | 24.9 MiB/s | 1.9 MiB | 00m00s
[ 71/116] Installing python-pip-wheel-0 100% | 177.9 MiB/s | 1.2 MiB | 00m00s
[ 72/116] Installing mpdecimal-0:4.0.1- 100% | 16.4 MiB/s | 218.8 KiB | 00m00s
[ 73/116] Installing python3-libs-0:3.1 100% | 68.6 MiB/s | 43.3 MiB | 00m01s
[ 74/116] Installing python3-0:3.14.0~r 100% | 309.8 KiB/s | 30.7 KiB | 00m00s
[ 75/116] Installing cmake-rpm-macros-0 100% | 2.7 MiB/s | 8.3 KiB | 00m00s
[ 76/116] Installing annobin-docs-0:12. 100% | 12.2 MiB/s | 100.1 KiB | 00m00s
[ 77/116] Installing kernel-headers-0:6 100% | 191.0 MiB/s | 6.9 MiB | 00m00s
[ 78/116] Installing glibc-devel-0:2.42 100% | 147.1 MiB/s | 2.4 MiB | 00m00s
[ 79/116] Installing libxcrypt-devel-0: 100% | 3.2 MiB/s | 33.1 KiB | 00m00s
[ 80/116] Installing gcc-0:15.2.1-2.fc4 100% | 82.2 MiB/s | 111.9 MiB | 00m01s
[ 81/116] Installing gcc-c++-0:15.2.1-2 100% | 78.8 MiB/s | 41.4 MiB | 00m01s
[ 82/116] Installing cuda-nvcc-13-0-0:1 100% | 69.0 MiB/s | 111.0 MiB | 00m02s
[ 83/116] Installing gcc14-0:14.3.1-1.f 100% | 71.5 MiB/s | 117.7 MiB | 00m02s
[ 84/116] Installing cuda-nvrtc-13-0-0: 100% | 66.2 MiB/s | 217.4 MiB | 00m03s
[ 85/116] Installing cuda-nvrtc-devel-1 100% | 79.6 MiB/s | 244.5 MiB | 00m03s
[ 86/116] Installing cuda-nvrtc-12-9-0: 100% | 71.1 MiB/s | 216.9 MiB | 00m03s
[ 87/116] Installing cuda-nvrtc-devel-1 100% | 86.8 MiB/s | 248.0 MiB | 00m03s
[ 88/116] Installing cuda-nvvm-12-9-0:1 100% | 60.4 MiB/s | 132.7 MiB | 00m02s
[ 89/116] Installing cuda-crt-12-9-0:12 100% | 114.0 MiB/s | 933.9 KiB | 00m00s
[ 90/116] Installing cuda-nvcc-12-9-0:1 100% | 89.0 MiB/s | 317.8 MiB | 00m04s
[ 91/116] Installing vim-filesystem-2:9 100% | 1.2 MiB/s | 4.7 KiB | 00m00s
[ 92/116] Installing emacs-filesystem-1 100% | 265.6 KiB/s | 544.0 B | 00m00s
[ 93/116] Installing cuda-profiler-api- 100% | 38.6 MiB/s | 79.1 KiB | 00m00s
[ 94/116] Installing cuda-driver-devel- 100% | 33.5 MiB/s | 137.0 KiB | 00m00s
[ 95/116] Installing cuda-profiler-api- 100% | 24.4 MiB/s | 74.9 KiB | 00m00s
[ 96/116] Installing cuda-driver-devel- 100% | 43.2 MiB/s | 132.8 KiB | 00m00s
[ 97/116] Installing cuda-nvprune-13-0- 100% | 88.9 MiB/s | 182.1 KiB | 00m00s
[ 98/116] Installing cuda-cuxxfilt-13-0 100% | 104.9 MiB/s | 1.0 MiB | 00m00s
[ 99/116] Installing cuda-cuobjdump-13- 100% | 81.5 MiB/s | 751.3 KiB | 00m00s
[100/116] Installing cuda-nvprune-12-9- 100% | 59.2 MiB/s | 181.8 KiB | 00m00s
[101/116] Installing cuda-cuxxfilt-12-9 100% | 95.0 MiB/s | 1.0 MiB | 00m00s
[102/116] Installing cuda-cuobjdump-12- 100% | 46.5 MiB/s | 666.6 KiB | 00m00s
[103/116] Installing rhash-0:1.4.5-3.fc 100% | 14.5 MiB/s | 356.4 KiB | 00m00s
[104/116] Installing libuv-1:1.51.0-2.f 100% | 46.6 MiB/s | 573.0 KiB | 00m00s
[105/116] Installing jsoncpp-0:1.9.6-2. 100% | 42.2 MiB/s | 259.2 KiB | 00m00s
[106/116] Installing cmake-0:3.31.6-4.f 100% | 75.3 MiB/s | 34.5 MiB | 00m00s
[107/116] Installing cmake-data-0:3.31. 100% | 71.4 MiB/s | 9.1 MiB | 00m00s
[108/116] Installing cuda-compiler-12-9 100% | 0.0 B/s | 124.0 B | 00m00s
[109/116] Installing cuda-compiler-13-0 100% | 0.0 B/s | 124.0 B | 00m00s
[110/116] Installing cuda-libraries-dev 100% | 0.0 B/s | 124.0 B | 00m00s
[111/116] Installing cuda-libraries-dev 100% | 20.2 KiB/s | 124.0 B | 00m00s
[112/116] Installing gcc14-c++-0:14.3.1 100% | 107.4 MiB/s | 124.2 MiB | 00m01s
[113/116] Installing annobin-plugin-gcc 100% | 75.9 MiB/s | 1.0 MiB | 00m00s
[114/116] Installing gcc-plugin-annobin 100% | 3.8 MiB/s | 58.6 KiB | 00m00s
[115/116] Installing cuda-nvml-devel-13 100% | 142.1 MiB/s | 1.4 MiB | 00m00s
[116/116] Installing cuda-nvml-devel-12 100% | 2.1 MiB/s | 1.4 MiB | 00m01s
Warning: skipped OpenPGP checks for 85 packages from repositories: https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64, https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64
Complete!
Finish: build setup for ollama-ggml-cuda-0.12.3-1.fc43.src.rpm
Start: rpmbuild ollama-ggml-cuda-0.12.3-1.fc43.src.rpm
Building target platforms: x86_64
Building for target x86_64
setting SOURCE_DATE_EPOCH=1759363200
Executing(%mkbuilddir): /bin/sh -e /var/tmp/rpm-tmp.H4cHMg
Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.qhB4TX
+ umask 022
+ cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ rm -rf ollama-0.12.3
+ /usr/lib/rpm/rpmuncompress -x /builddir/build/SOURCES/v0.12.3.tar.gz
+ STATUS=0
+ '[' 0 -ne 0 ']'
+ cd ollama-0.12.3
+ /usr/bin/chmod -Rf a+rX,u+w,g-w,o-w .
+ /usr/bin/patch -p1 -s --fuzz=0 --no-backup-if-mismatch -f
+ /usr/lib/rpm/rpmuncompress /builddir/build/SOURCES/remove-runtime-for-cuda-and-rocm.patch
+ /usr/bin/patch -p1 -s --fuzz=0 --no-backup-if-mismatch -f
+ /usr/lib/rpm/rpmuncompress /builddir/build/SOURCES/replace-library-paths.patch
+ cp -a /usr/local/cuda-12/ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/
+ patch -p1 -d /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/targets/x86_64-linux/
patching file include/crt/math_functions.h
Hunk #1 succeeded at 2553 with fuzz 1.
Hunk #2 succeeded at 2576 with fuzz 1.
Hunk #3 succeeded at 2598 with fuzz 1.
patch unexpectedly ends in middle of line
Hunk #4 succeeded at 2620 with fuzz 1.
patching file include/crt/math_functions.h
Hunk #1 succeeded at 594 (offset -32 lines).
Hunk #2 succeeded at 622 (offset -32 lines).
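The "patch unexpectedly ends in middle of line" message above is harmless here: patch(1) prints it when the last line of a patch file has no trailing newline, and all four hunks still applied (with fuzz 1). A minimal check for that condition, with a hypothetical file name since the log does not name the patch being piped in:

    # Command substitution strips a trailing newline, so $(tail -c 1 ...) is
    # non-empty exactly when the file's final byte is not a newline.
    p=math_functions.patch    # hypothetical name for the CUDA header patch
    [ -n "$(tail -c 1 "$p")" ] && echo "no trailing newline: $p"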
+ patch -p1 -d /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/targets/x86_64-linux/
+ cp -a /usr/local/cuda-13/ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/
+ patch -p1 -d /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/targets/x86_64-linux/
patching file include/crt/math_functions.h
+ RPM_EC=0
++ jobs -p
+ exit 0
Executing(%build): /bin/sh -e /var/tmp/rpm-tmp.1Olv8r
+ umask 022
+ cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ CFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer '
+ export CFLAGS
+ CXXFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer '
+ export CXXFLAGS
+ FFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules '
+ export FFLAGS
+ FCFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules '
+ export FCFLAGS
+ VALAFLAGS=-g
+ export VALAFLAGS
+ RUSTFLAGS='-Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Cforce-frame-pointers=yes -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn'
+ export RUSTFLAGS
+ LDFLAGS='-Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-hardened-ld-errors -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes '
+ export LDFLAGS
+ LT_SYS_LIBRARY_PATH=/usr/lib64:
+ export LT_SYS_LIBRARY_PATH
+ CC=gcc
+ export CC
+ CXX=g++
+ export CXX
+ cd ollama-0.12.3
+ CFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer '
+ export CFLAGS
+ CXXFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer '
+ export CXXFLAGS
+ FFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules '
+ export FFLAGS
+ FCFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules '
+ export FCFLAGS
+ VALAFLAGS=-g
+ export VALAFLAGS
+ RUSTFLAGS='-Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Cforce-frame-pointers=yes -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn'
+ export RUSTFLAGS
+ LDFLAGS='-Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-hardened-ld-errors -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes '
+ export LDFLAGS
+ LT_SYS_LIBRARY_PATH=/usr/lib64:
+ export LT_SYS_LIBRARY_PATH
+ CC=gcc
+ export CC
+ CXX=g++
+ export CXX
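The block of exports above appears twice with identical values; that is expected rather than a bug — the spec evidently expands Fedora's %set_build_flags once at the top of %build and again after changing into ollama-0.12.3, and the macro is idempotent. The values themselves are not hand-written in the spec; they come from redhat-rpm-config macros and can be reproduced on any Fedora 43 host (a hedged sketch, not part of the build):

    # Print the distro-wide compiler and linker flags that %set_build_flags
    # exports as CFLAGS/CXXFLAGS/LDFLAGS in the trace above.
    rpm --eval '%{build_cflags}'
    rpm --eval '%{build_ldflags}'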
+ /usr/bin/cmake -S . -B redhat-linux-build_cuda-13 -DCMAKE_C_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_CXX_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_Fortran_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON -DCMAKE_INSTALL_DO_STRIP:BOOL=OFF -DCMAKE_INSTALL_PREFIX:PATH=/usr -DCMAKE_INSTALL_FULL_SBINDIR:PATH=/usr/bin -DCMAKE_INSTALL_SBINDIR:PATH=bin -DINCLUDE_INSTALL_DIR:PATH=/usr/include -DLIB_INSTALL_DIR:PATH=/usr/lib64 -DSYSCONF_INSTALL_DIR:PATH=/etc -DSHARE_INSTALL_PREFIX:PATH=/usr/share -DLIB_SUFFIX=64 -DBUILD_SHARED_LIBS:BOOL=ON --preset 'CUDA 13' -DOLLAMA_RUNNER_DIR=cuda_v13 -DCMAKE_CUDA_COMPILER=/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -DCMAKE_CUDA_FLAGS_RELEASE=-DNDEBUG '-DCMAKE_CUDA_FLAGS=-O2 -g -Xcompiler "-fPIC"'
Preset CMake variables:
  CMAKE_BUILD_TYPE="Release"
  CMAKE_CUDA_ARCHITECTURES="75-virtual;80-virtual;86-virtual;87-virtual;89-virtual;90-virtual;90a-virtual;100-virtual;110-virtual;120-virtual;121-virtual"
  CMAKE_MSVC_RUNTIME_LIBRARY="MultiThreaded"
-- The C compiler identification is GNU 15.2.1
-- The CXX compiler identification is GNU 15.2.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Warning: ccache not found - consider installing it for faster compilation or disable this warning with GGML_CCACHE=OFF
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- GGML_SYSTEM_ARCH: x86
-- Including CPU backend
-- x86 detected
-- Adding CPU backend variant ggml-cpu-x64:
-- x86 detected
-- Adding CPU backend variant ggml-cpu-sse42: -msse4.2 GGML_SSE42
-- x86 detected
-- Adding CPU backend variant ggml-cpu-sandybridge: -msse4.2;-mavx GGML_SSE42;GGML_AVX
-- x86 detected
-- Adding CPU backend variant ggml-cpu-haswell: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2 GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2
-- x86 detected
-- Adding CPU backend variant ggml-cpu-skylakex: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2;-mavx512f;-mavx512cd;-mavx512vl;-mavx512dq;-mavx512bw GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2;GGML_AVX512
-- x86 detected
-- Adding CPU backend variant ggml-cpu-icelake: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2;-mavx512f;-mavx512cd;-mavx512vl;-mavx512dq;-mavx512bw;-mavx512vbmi;-mavx512vnni GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2;GGML_AVX512;GGML_AVX512_VBMI;GGML_AVX512_VNNI
-- x86 detected
-- Adding CPU backend variant ggml-cpu-alderlake: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2;-mavxvnni GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2;GGML_AVX_VNNI
-- Found CUDAToolkit: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/targets/x86_64-linux/include (found version "13.0.88")
-- CUDA Toolkit found
-- Using CUDA architectures: 75-virtual;80-virtual;86-virtual;87-virtual;89-virtual;90-virtual;90a-virtual;100-virtual;110-virtual;120-virtual;121-virtual
-- The CUDA compiler identification is NVIDIA 13.0.88 with host compiler GNU 15.2.1
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Looking for a HIP compiler
-- Looking for a HIP compiler - NOTFOUND
-- Configuring done (8.2s)
-- Generating done (0.0s)
CMake Warning:
  Manually-specified variables were not used by the project:
    CMAKE_Fortran_FLAGS_RELEASE
    CMAKE_INSTALL_DO_STRIP
    INCLUDE_INSTALL_DIR
    LIB_SUFFIX
    SHARE_INSTALL_PREFIX
    SYSCONF_INSTALL_DIR
-- Build files have been written to: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13
+ /usr/bin/cmake --build redhat-linux-build_cuda-13 -j4 --verbose --target ggml-cuda
Change Dir: '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
Run Build Command(s): /usr/bin/cmake -E env VERBOSE=1 /usr/bin/gmake -f Makefile -j4 ggml-cuda
/usr/bin/cmake -S/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 -B/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13 --check-build-system CMakeFiles/Makefile.cmake 0
/usr/bin/gmake -f CMakeFiles/Makefile2 ggml-cuda
gmake[1]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
/usr/bin/cmake -S/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 -B/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13 --check-build-system CMakeFiles/Makefile.cmake 0
/usr/bin/cmake -E cmake_progress_start /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/CMakeFiles 47
/usr/bin/gmake -f CMakeFiles/Makefile2 ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/all
gmake[2]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
/usr/bin/gmake -f ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/build.make ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/depend
gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13 && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/DependInfo.cmake "--color="
gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
/usr/bin/gmake -f ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/build.make ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/build
gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
[ 2%] Building C object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.c.o
[ 4%] Building C object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-alloc.c.o
[ 4%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.cpp.o
[ 4%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-backend.cpp.o
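Every entry in CMAKE_CUDA_ARCHITECTURES above carries the -virtual suffix, which tells nvcc to embed only PTX (compute_NN) and no precompiled SASS; the driver JIT-compiles the PTX for whatever GPU is present at load time, trading first-launch latency for a smaller, forward-compatible binary. One hedged way to verify that on the finished library (cuobjdump comes from the cuda-cuobjdump packages installed earlier; the .so path is illustrative):

    # A PTX-only fat binary lists one entry per compute_NN target here...
    cuobjdump --list-ptx cuda_v13/libggml-cuda.so
    # ...and nothing here, since no sm_NN machine code was embedded.
    cuobjdump --list-elf cuda_v13/libggml-cuda.so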
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/gcc -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.c.o -MF CMakeFiles/ggml-base.dir/ggml.c.o.d -o CMakeFiles/ggml-base.dir/ggml.c.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.cpp.o -MF CMakeFiles/ggml-base.dir/ggml.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.cpp
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/gcc -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-alloc.c.o -MF CMakeFiles/ggml-base.dir/ggml-alloc.c.o.d -o CMakeFiles/ggml-base.dir/ggml-alloc.c.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-alloc.c
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-backend.cpp.o -MF CMakeFiles/ggml-base.dir/ggml-backend.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml-backend.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-backend.cpp
In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-alloc.c:4:
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘ggml_hash_insert’ defined but not used [-Wunused-function]
  261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |               ^~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘ggml_hash_contains’ defined but not used [-Wunused-function]
  256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘ggml_bitset_size’ defined but not used [-Wunused-function]
  187 | static size_t ggml_bitset_size(size_t n) {
      |               ^~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘ggml_set_op_params_f32’ defined but not used [-Wunused-function]
  150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘ggml_set_op_params_i32’ defined but not used [-Wunused-function]
  145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘ggml_get_op_params_f32’ defined but not used [-Wunused-function]
  140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) {
      |              ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘ggml_get_op_params_i32’ defined but not used [-Wunused-function]
  135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) {
      |                ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘ggml_set_op_params’ defined but not used [-Wunused-function]
  129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c:5663:13: warning: ‘ggml_hash_map_free’ defined but not used [-Wunused-function]
 5663 | static void ggml_hash_map_free(struct hash_map * map) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c:5656:26: warning: ‘ggml_new_hash_map’ defined but not used [-Wunused-function]
 5656 | static struct hash_map * ggml_new_hash_map(size_t size) {
      |                          ^~~~~~~~~~~~~~~~~
In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c:5:
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘ggml_hash_find_or_insert’ defined but not used [-Wunused-function]
  282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |               ^~~~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘ggml_hash_contains’ defined but not used [-Wunused-function]
  256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘ggml_get_op_params_f32’ defined but not used [-Wunused-function]
  140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) {
      |              ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘ggml_are_same_layout’ defined but not used [-Wunused-function]
   77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) {
      |             ^~~~~~~~~~~~~~~~~~~~
In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.cpp:1:
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘size_t ggml_hash_find_or_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function]
  282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |               ^~~~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘size_t ggml_hash_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function]
  261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |               ^~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘bool ggml_hash_contains(const ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function]
  256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘size_t ggml_bitset_size(size_t)’ defined but not used [-Wunused-function]
  187 | static size_t ggml_bitset_size(size_t n) {
      |               ^~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘void ggml_set_op_params_f32(ggml_tensor*, uint32_t, float)’ defined but not used [-Wunused-function]
  150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘void ggml_set_op_params_i32(ggml_tensor*, uint32_t, int32_t)’ defined but not used [-Wunused-function]
  145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘float ggml_get_op_params_f32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function]
  140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) {
      |              ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘int32_t ggml_get_op_params_i32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function]
  135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) {
      |                ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘void ggml_set_op_params(ggml_tensor*, const void*, size_t)’ defined but not used [-Wunused-function]
  129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘bool ggml_are_same_layout(const ggml_tensor*, const ggml_tensor*)’ defined but not used [-Wunused-function]
   77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) {
      |             ^~~~~~~~~~~~~~~~~~~~
[ 6%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-opt.cpp.o
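The long run of -Wunused-function diagnostics above (and repeated below for the remaining translation units) is benign: ggml-impl.h defines its small helpers as plain static functions, so every .c/.cpp file that includes the header without calling a given helper warns about it under -Wall. A minimal reproduction, with illustrative file names:

    # A plain static function defined in a header warns in any TU that
    # includes it but never calls it...
    printf 'static int helper(int x) { return x + 1; }\n' > impl.h
    printf '#include "impl.h"\nint main(void) { return 0; }\n' > tu.c
    gcc -Wall -c tu.c    # warns: 'helper' defined but not used
    # ...while 'static inline' is exempt from -Wunused-function.
    printf 'static inline int helper(int x) { return x + 1; }\n' > impl.h
    gcc -Wall -c tu.c    # compiles silently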
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-opt.cpp.o -MF CMakeFiles/ggml-base.dir/ggml-opt.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml-opt.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-opt.cpp
In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-backend.cpp:14:
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘bool ggml_hash_contains(const ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function]
  256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘size_t ggml_bitset_size(size_t)’ defined but not used [-Wunused-function]
  187 | static size_t ggml_bitset_size(size_t n) {
      |               ^~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘void ggml_set_op_params_f32(ggml_tensor*, uint32_t, float)’ defined but not used [-Wunused-function]
  150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘void ggml_set_op_params_i32(ggml_tensor*, uint32_t, int32_t)’ defined but not used [-Wunused-function]
  145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘float ggml_get_op_params_f32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function]
  140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) {
      |              ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘int32_t ggml_get_op_params_i32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function]
  135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) {
      |                ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘void ggml_set_op_params(ggml_tensor*, const void*, size_t)’ defined but not used [-Wunused-function]
  129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) {
      |             ^~~~~~~~~~~~~~~~~~
[ 6%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-threading.cpp.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-threading.cpp.o -MF CMakeFiles/ggml-base.dir/ggml-threading.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml-threading.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-threading.cpp
[ 6%] Building C object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-quants.c.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/gcc -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-quants.c.o -MF CMakeFiles/ggml-base.dir/ggml-quants.c.o.d -o CMakeFiles/ggml-base.dir/ggml-quants.c.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c
In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-opt.cpp:6:
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘size_t ggml_hash_find_or_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function]
  282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |               ^~~~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘size_t ggml_hash_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function]
  261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |               ^~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘bool ggml_hash_contains(const ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function]
  256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘size_t ggml_bitset_size(size_t)’ defined but not used [-Wunused-function]
  187 | static size_t ggml_bitset_size(size_t n) {
      |               ^~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘void ggml_set_op_params_f32(ggml_tensor*, uint32_t, float)’ defined but not used [-Wunused-function]
  150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘void ggml_set_op_params_i32(ggml_tensor*, uint32_t, int32_t)’ defined but not used [-Wunused-function]
  145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘float ggml_get_op_params_f32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function]
  140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) {
      |              ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘int32_t ggml_get_op_params_i32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function]
  135 | 
static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘void ggml_set_op_params(ggml_tensor*, const void*, size_t)’ defined but not used [-Wunused-function] 129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘bool ggml_are_same_layout(const ggml_tensor*, const ggml_tensor*)’ defined but not used [-Wunused-function] 77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) { | ^~~~~~~~~~~~~~~~~~~~ [ 8%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/gguf.cpp.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/gguf.cpp.o -MF CMakeFiles/ggml-base.dir/gguf.cpp.o.d -o CMakeFiles/ggml-base.dir/gguf.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/gguf.cpp /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c:4067:12: warning: ‘iq1_find_best_neighbour’ defined but not used [-Wunused-function] 4067 | static int iq1_find_best_neighbour(const uint16_t * GGML_RESTRICT neighbours, const uint64_t * GGML_RESTRICT grid, | ^~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c:579:14: warning: ‘make_qkx1_quants’ defined but not used [-Wunused-function] 579 | static float make_qkx1_quants(int n, int nmax, const float * GGML_RESTRICT x, uint8_t * GGML_RESTRICT L, float * GGML_RESTRICT the_min, | ^~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c:5: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘ggml_hash_find_or_insert’ defined but not used 
[-Wunused-function] 282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘ggml_hash_insert’ defined but not used [-Wunused-function] 261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘ggml_hash_contains’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘ggml_bitset_size’ defined but not used [-Wunused-function] 187 | static size_t ggml_bitset_size(size_t n) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘ggml_set_op_params_f32’ defined but not used [-Wunused-function] 150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘ggml_set_op_params_i32’ defined but not used [-Wunused-function] 145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘ggml_get_op_params_f32’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘ggml_get_op_params_i32’ defined but not used [-Wunused-function] 135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘ggml_set_op_params’ defined but not used [-Wunused-function] 129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘ggml_are_same_layout’ defined but not used [-Wunused-function] 77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) { | ^~~~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/gguf.cpp:3: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘size_t ggml_hash_find_or_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘size_t ggml_hash_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 261 | 
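Note: every -Wunused-function warning above comes from ggml-impl.h defining (not just declaring) static helper functions in the header, so each translation unit that includes it owns a private internal-linkage copy, and -Wall flags whichever copies that unit never calls. A minimal sketch of the mechanism, with illustrative names that are not from the ollama sources:

    // unused_demo.cpp -- build: g++ -Wall -std=c++17 -c unused_demo.cpp
    static int helper_used(int x)   { return x + 1; }
    static int helper_unused(int x) { return x - 1; }             // g++: 'helper_unused' defined but not used
    [[maybe_unused]] static int helper_quiet(int x) { return x; } // attribute suppresses the warning
    int consume() { return helper_used(1); }

The warnings are harmless and simply repeat once per including file, which is why the same ggml-impl.h list shows up four times in this log.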
[ 8%] Linking CXX shared library ../../../../../lib/ollama/libggml-base.so
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/cmake -E cmake_link_script CMakeFiles/ggml-base.dir/link.txt --verbose=1
/usr/bin/g++ -fPIC -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -Wl,--dependency-file=CMakeFiles/ggml-base.dir/link.d -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-hardened-ld-errors -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -shared -Wl,-soname,libggml-base.so -o ../../../../../lib/ollama/libggml-base.so "CMakeFiles/ggml-base.dir/ggml.c.o" "CMakeFiles/ggml-base.dir/ggml.cpp.o" "CMakeFiles/ggml-base.dir/ggml-alloc.c.o" "CMakeFiles/ggml-base.dir/ggml-backend.cpp.o" "CMakeFiles/ggml-base.dir/ggml-opt.cpp.o" "CMakeFiles/ggml-base.dir/ggml-threading.cpp.o" "CMakeFiles/ggml-base.dir/ggml-quants.c.o" "CMakeFiles/ggml-base.dir/gguf.cpp.o" -lm
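Note: the recurring -specs=/usr/lib/rpm/redhat/redhat-hardened-* flags are Fedora's distribution-wide hardening: _FORTIFY_SOURCE=3 and -fstack-protector-strong on every compile line, and -Wl,-z,relro with -Wl,-z,now on the link line (full RELRO: the GOT is made read-only after eager symbol resolution). A minimal sketch of what the fortify level buys at run time (hypothetical file, not part of this build):

    // fortify_demo.cpp -- build: g++ -O2 -D_FORTIFY_SOURCE=3 fortify_demo.cpp && ./a.out x
    #include <cstring>
    int main(int argc, char **argv) {
        char buf[8] = {0};
        if (argc > 1)
            std::memcpy(buf, argv[1], 16); // compiled to __memcpy_chk; glibc aborts with
                                           // "*** buffer overflow detected ***"
        return buf[0];
    }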
gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
[ 8%] Built target ggml-base
/usr/bin/gmake -f ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/build.make ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/depend
gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13 && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/DependInfo.cmake "--color="
gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
/usr/bin/gmake -f ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/build.make ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/build
gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
[ 8%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/acc.cu.o
[ 10%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/arange.cu.o
[ 10%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/add-id.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/acc.cu.o -MF CMakeFiles/ggml-cuda.dir/acc.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/acc.cu -o CMakeFiles/ggml-cuda.dir/acc.cu.o
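Note: each .cu file is compiled for eleven virtual architectures, from compute_75 (Turing) through compute_121, and the code=[compute_XY] form embeds only PTX, no SASS: the driver JIT-compiles the best-matching PTX variant for the actual GPU at load time, which keeps the fat binaries smaller and forward-compatible at the cost of a first-load JIT. A host-side sketch of how that match is made (assumes a CUDA toolkit and a visible device; the include/library paths are illustrative):

    // cc_query.cpp -- build: g++ cc_query.cpp -I/usr/local/cuda/include -L/usr/local/cuda/lib64 -lcudart
    #include <cuda_runtime.h>
    #include <cstdio>
    int main() {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
            std::fprintf(stderr, "no CUDA device visible\n");
            return 1;
        }
        // the driver JITs the highest embedded compute_XY with XY <= this capability
        std::printf("device 0: compute capability %d.%d\n", prop.major, prop.minor);
        return 0;
    }

A compute-8.6 card, for example, would run the compute_86 PTX, while a newer architecture falls back to JIT-ing the highest variant it supports.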
"--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/acc.cu.o -MF CMakeFiles/ggml-cuda.dir/acc.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/acc.cu -o CMakeFiles/ggml-cuda.dir/acc.cu.o [ 10%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argmax.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/add-id.cu.o -MF CMakeFiles/ggml-cuda.dir/add-id.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/add-id.cu -o CMakeFiles/ggml-cuda.dir/add-id.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/arange.cu.o -MF CMakeFiles/ggml-cuda.dir/arange.cu.o.d -x cu -c 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/arange.cu -o CMakeFiles/ggml-cuda.dir/arange.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argmax.cu.o -MF CMakeFiles/ggml-cuda.dir/argmax.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/argmax.cu -o CMakeFiles/ggml-cuda.dir/argmax.cu.o [ 12%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argsort.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argsort.cu.o -MF CMakeFiles/ggml-cuda.dir/argsort.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/argsort.cu -o CMakeFiles/ggml-cuda.dir/argsort.cu.o [ 12%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/binbcast.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/binbcast.cu.o -MF CMakeFiles/ggml-cuda.dir/binbcast.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/binbcast.cu -o CMakeFiles/ggml-cuda.dir/binbcast.cu.o [ 14%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/clamp.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/clamp.cu.o -MF CMakeFiles/ggml-cuda.dir/clamp.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/clamp.cu -o CMakeFiles/ggml-cuda.dir/clamp.cu.o [ 14%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/concat.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file 
CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/concat.cu.o -MF CMakeFiles/ggml-cuda.dir/concat.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/concat.cu -o CMakeFiles/ggml-cuda.dir/concat.cu.o [ 14%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o -MF CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/conv-transpose-1d.cu -o CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o [ 17%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" 
"--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o -MF CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/conv2d-dw.cu -o CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o [ 17%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o -MF CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/conv2d-transpose.cu -o CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o [ 19%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/convert.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" 
"--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/convert.cu.o -MF CMakeFiles/ggml-cuda.dir/convert.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/convert.cu -o CMakeFiles/ggml-cuda.dir/convert.cu.o [ 19%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/count-equal.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/count-equal.cu.o -MF CMakeFiles/ggml-cuda.dir/count-equal.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/count-equal.cu -o CMakeFiles/ggml-cuda.dir/count-equal.cu.o [ 21%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cpy.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT 
ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cpy.cu.o -MF CMakeFiles/ggml-cuda.dir/cpy.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/cpy.cu -o CMakeFiles/ggml-cuda.dir/cpy.cu.o [ 21%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o -MF CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/cross-entropy-loss.cu -o CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o [ 23%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/diagmask.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/diagmask.cu.o -MF CMakeFiles/ggml-cuda.dir/diagmask.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/diagmask.cu -o CMakeFiles/ggml-cuda.dir/diagmask.cu.o [ 23%] Building 
CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn-tile-f16.cu -o CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o [ 23%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn-tile-f32.cu -o CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o [ 25%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn-wmma-f16.cu -o CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o [ 25%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn.cu -o CMakeFiles/ggml-cuda.dir/fattn.cu.o [ 27%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/getrows.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file 
CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/getrows.cu.o -MF CMakeFiles/ggml-cuda.dir/getrows.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/getrows.cu -o CMakeFiles/ggml-cuda.dir/getrows.cu.o [ 27%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o -MF CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/ggml-cuda.cu -o CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o [ 29%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/gla.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" 
"--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/gla.cu.o -MF CMakeFiles/ggml-cuda.dir/gla.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/gla.cu -o CMakeFiles/ggml-cuda.dir/gla.cu.o [ 29%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/im2col.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/im2col.cu.o -MF CMakeFiles/ggml-cuda.dir/im2col.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/im2col.cu -o CMakeFiles/ggml-cuda.dir/im2col.cu.o [ 29%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mean.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" 
"--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mean.cu.o -MF CMakeFiles/ggml-cuda.dir/mean.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mean.cu -o CMakeFiles/ggml-cuda.dir/mean.cu.o [ 31%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmf.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmf.cu.o -MF CMakeFiles/ggml-cuda.dir/mmf.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmf.cu -o CMakeFiles/ggml-cuda.dir/mmf.cu.o [ 31%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmq.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmq.cu.o -MF CMakeFiles/ggml-cuda.dir/mmq.cu.o.d -x cu -c 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmq.cu -o CMakeFiles/ggml-cuda.dir/mmq.cu.o [ 34%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvf.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvf.cu.o -MF CMakeFiles/ggml-cuda.dir/mmvf.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmvf.cu -o CMakeFiles/ggml-cuda.dir/mmvf.cu.o [ 34%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvq.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvq.cu.o -MF CMakeFiles/ggml-cuda.dir/mmvq.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmvq.cu -o CMakeFiles/ggml-cuda.dir/mmvq.cu.o [ 36%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/norm.cu.o cd 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/norm.cu.o -MF CMakeFiles/ggml-cuda.dir/norm.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/norm.cu -o CMakeFiles/ggml-cuda.dir/norm.cu.o [ 36%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o -MF CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/opt-step-adamw.cu -o CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o [ 38%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/out-prod.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 
-DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/out-prod.cu.o -MF CMakeFiles/ggml-cuda.dir/out-prod.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/out-prod.cu -o CMakeFiles/ggml-cuda.dir/out-prod.cu.o [ 38%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pad.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pad.cu.o -MF CMakeFiles/ggml-cuda.dir/pad.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/pad.cu -o CMakeFiles/ggml-cuda.dir/pad.cu.o [ 38%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pool2d.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" 
"--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pool2d.cu.o -MF CMakeFiles/ggml-cuda.dir/pool2d.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/pool2d.cu -o CMakeFiles/ggml-cuda.dir/pool2d.cu.o [ 40%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/quantize.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/quantize.cu.o -MF CMakeFiles/ggml-cuda.dir/quantize.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/quantize.cu -o CMakeFiles/ggml-cuda.dir/quantize.cu.o [ 40%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/roll.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" 
"--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/roll.cu.o -MF CMakeFiles/ggml-cuda.dir/roll.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/roll.cu -o CMakeFiles/ggml-cuda.dir/roll.cu.o [ 42%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/rope.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/rope.cu.o -MF CMakeFiles/ggml-cuda.dir/rope.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/rope.cu -o CMakeFiles/ggml-cuda.dir/rope.cu.o [ 42%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/scale.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/scale.cu.o -MF 
CMakeFiles/ggml-cuda.dir/scale.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/scale.cu -o CMakeFiles/ggml-cuda.dir/scale.cu.o [ 44%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/set-rows.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/set-rows.cu.o -MF CMakeFiles/ggml-cuda.dir/set-rows.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/set-rows.cu -o CMakeFiles/ggml-cuda.dir/set-rows.cu.o [ 44%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softcap.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softcap.cu.o -MF CMakeFiles/ggml-cuda.dir/softcap.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/softcap.cu -o CMakeFiles/ggml-cuda.dir/softcap.cu.o [ 46%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softmax.cu.o cd 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softmax.cu.o -MF CMakeFiles/ggml-cuda.dir/softmax.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/softmax.cu -o CMakeFiles/ggml-cuda.dir/softmax.cu.o [ 46%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o -MF CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/ssm-conv.cu -o CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o [ 46%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 
-DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o -MF CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/ssm-scan.cu -o CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o [ 48%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sum.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sum.cu.o -MF CMakeFiles/ggml-cuda.dir/sum.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/sum.cu -o CMakeFiles/ggml-cuda.dir/sum.cu.o [ 48%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sumrows.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" 
"--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sumrows.cu.o -MF CMakeFiles/ggml-cuda.dir/sumrows.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/sumrows.cu -o CMakeFiles/ggml-cuda.dir/sumrows.cu.o [ 51%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/tsembd.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/tsembd.cu.o -MF CMakeFiles/ggml-cuda.dir/tsembd.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/tsembd.cu -o CMakeFiles/ggml-cuda.dir/tsembd.cu.o [ 51%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/unary.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" 
"--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/unary.cu.o -MF CMakeFiles/ggml-cuda.dir/unary.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/unary.cu -o CMakeFiles/ggml-cuda.dir/unary.cu.o [ 53%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/upscale.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/upscale.cu.o -MF CMakeFiles/ggml-cuda.dir/upscale.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/upscale.cu -o CMakeFiles/ggml-cuda.dir/upscale.cu.o [ 53%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/wkv.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/wkv.cu.o -MF 
CMakeFiles/ggml-cuda.dir/wkv.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/wkv.cu -o CMakeFiles/ggml-cuda.dir/wkv.cu.o [ 53%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o [ 55%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT 
ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o [ 55%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o [ 57%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" 
"--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o [ 57%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o [ 59%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" 
"--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o [ 59%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o [ 61%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" 
"--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o [ 61%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o [ 61%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED 
-DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o [ 63%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o [ 63%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o [ 65%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o [ 65%] Building CUDA object 
ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o [ 68%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o.d -x cu -c 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o [ 68%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o [ 68%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT 
ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o [ 70%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o [ 70%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" 
"--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o [ 72%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq1_s.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o [ 72%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" 
"--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_s.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o [ 74%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_xs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o [ 74%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" 
"--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_xxs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o [ 76%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq3_s.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o [ 76%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" 
"--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq3_xxs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o [ 76%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq4_nl.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o [ 78%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" 
"--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq4_xs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o [ 78%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-mxfp4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o [ 80%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" 
"--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q2_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o [ 80%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q3_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o [ 82%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" 
"--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q4_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o [ 82%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q4_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o [ 82%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" 
"--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q4_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o [ 85%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q5_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o [ 85%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" 
"--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q5_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o [ 87%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q5_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o [ 87%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" 
"--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q6_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o [ 89%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q8_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o [ 89%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" 
"--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o [ 91%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o [ 91%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" 
"--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o [ 91%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o [ 93%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 
"--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o [ 93%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o [ 95%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED 
-DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o [ 95%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o [ 97%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler 
-DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o [ 97%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o [100%] Linking CUDA shared module ../../../../../../lib/ollama/libggml-cuda.so cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/bin/cmake -E cmake_link_script 
CMakeFiles/ggml-cuda.dir/link.txt --verbose=1 /usr/bin/g++ -fPIC -Wl,--dependency-file=CMakeFiles/ggml-cuda.dir/link.d -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-hardened-ld-errors -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -shared -o ../../../../../../lib/ollama/libggml-cuda.so @CMakeFiles/ggml-cuda.dir/objects1.rsp @CMakeFiles/ggml-cuda.dir/linkLibs.rsp -L"/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/targets/x86_64-linux/lib/stubs" -L"/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/targets/x86_64-linux/lib" gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13' [100%] Built target ggml-cuda gmake[2]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13' /usr/bin/cmake -E cmake_progress_start /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/CMakeFiles 0 gmake[1]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13' + CFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer ' + export CFLAGS + CXXFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer ' + export CXXFLAGS + FFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' + export FFLAGS + FCFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' + export FCFLAGS + VALAFLAGS=-g + export VALAFLAGS + RUSTFLAGS='-Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Cforce-frame-pointers=yes -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn' + export RUSTFLAGS + LDFLAGS='-Wl,-z,relro -Wl,--as-needed 
-Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-hardened-ld-errors -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes ' + export LDFLAGS + LT_SYS_LIBRARY_PATH=/usr/lib64: + export LT_SYS_LIBRARY_PATH + CC=gcc + export CC + CXX=g++ + export CXX + /usr/bin/cmake -S . -B redhat-linux-build_cuda-12 -DCMAKE_C_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_CXX_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_Fortran_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON -DCMAKE_INSTALL_DO_STRIP:BOOL=OFF -DCMAKE_INSTALL_PREFIX:PATH=/usr -DCMAKE_INSTALL_FULL_SBINDIR:PATH=/usr/bin -DCMAKE_INSTALL_SBINDIR:PATH=bin -DINCLUDE_INSTALL_DIR:PATH=/usr/include -DLIB_INSTALL_DIR:PATH=/usr/lib64 -DSYSCONF_INSTALL_DIR:PATH=/etc -DSHARE_INSTALL_PREFIX:PATH=/usr/share -DLIB_SUFFIX=64 -DBUILD_SHARED_LIBS:BOOL=ON --preset 'CUDA 12' -DOLLAMA_RUNNER_DIR=cuda_v12 -DCMAKE_CUDA_COMPILER=/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -DCMAKE_CUDA_HOST_COMPILER=g++-14 -DCMAKE_CUDA_FLAGS_RELEASE=-DNDEBUG '-DCMAKE_CUDA_FLAGS=-O2 -g -Xcompiler "-fPIC"' Preset CMake variables: CMAKE_BUILD_TYPE="Release" CMAKE_CUDA_ARCHITECTURES="50;60;61;70;75;80;86;87;89;90;90a;120" CMAKE_MSVC_RUNTIME_LIBRARY="MultiThreaded" -- The C compiler identification is GNU 15.2.1 -- The CXX compiler identification is GNU 15.2.1 -- Detecting C compiler ABI info -- Detecting C compiler ABI info - done -- Check for working C compiler: /usr/bin/gcc - skipped -- Detecting C compile features -- Detecting C compile features - done -- Detecting CXX compiler ABI info -- Detecting CXX compiler ABI info - done -- Check for working CXX compiler: /usr/bin/g++ - skipped -- Detecting CXX compile features -- Detecting CXX compile features - done -- Performing Test CMAKE_HAVE_LIBC_PTHREAD -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success -- Found Threads: TRUE -- Warning: ccache not found - consider installing it for faster compilation or disable this warning with GGML_CCACHE=OFF -- CMAKE_SYSTEM_PROCESSOR: x86_64 -- GGML_SYSTEM_ARCH: x86 -- Including CPU backend -- x86 detected -- Adding CPU backend variant ggml-cpu-x64: -- x86 detected -- Adding CPU backend variant ggml-cpu-sse42: -msse4.2 GGML_SSE42 -- x86 detected -- Adding CPU backend variant ggml-cpu-sandybridge: -msse4.2;-mavx GGML_SSE42;GGML_AVX -- x86 detected -- Adding CPU backend variant ggml-cpu-haswell: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2 GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2 -- x86 detected -- Adding CPU backend variant ggml-cpu-skylakex: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2;-mavx512f;-mavx512cd;-mavx512vl;-mavx512dq;-mavx512bw GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2;GGML_AVX512 -- x86 detected -- Adding CPU backend variant ggml-cpu-icelake: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2;-mavx512f;-mavx512cd;-mavx512vl;-mavx512dq;-mavx512bw;-mavx512vbmi;-mavx512vnni GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2;GGML_AVX512;GGML_AVX512_VBMI;GGML_AVX512_VNNI -- x86 detected -- Adding CPU backend variant ggml-cpu-alderlake: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2;-mavxvnni GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2;GGML_AVX_VNNI -- Found CUDAToolkit: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/targets/x86_64-linux/include (found version "12.9.86") -- CUDA Toolkit found -- Using CUDA architectures: 50;60;61;70;75;80;86;87;89;90;90a;120 -- The CUDA compiler identification 
is NVIDIA 12.9.86 with host compiler GNU 14.3.1 -- Detecting CUDA compiler ABI info -- Detecting CUDA compiler ABI info - done -- Check for working CUDA compiler: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc - skipped -- Detecting CUDA compile features -- Detecting CUDA compile features - done -- Looking for a HIP compiler -- Looking for a HIP compiler - NOTFOUND -- Configuring done (5.7s) -- Generating done (0.0s) CMake Warning: Manually-specified variables were not used by the project: CMAKE_Fortran_FLAGS_RELEASE CMAKE_INSTALL_DO_STRIP INCLUDE_INSTALL_DIR LIB_SUFFIX SHARE_INSTALL_PREFIX SYSCONF_INSTALL_DIR -- Build files have been written to: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12 + /usr/bin/cmake --build redhat-linux-build_cuda-12 -j4 --verbose --target ggml-cuda Change Dir: '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12' Run Build Command(s): /usr/bin/cmake -E env VERBOSE=1 /usr/bin/gmake -f Makefile -j4 ggml-cuda /usr/bin/cmake -S/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 -B/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12 --check-build-system CMakeFiles/Makefile.cmake 0 /usr/bin/gmake -f CMakeFiles/Makefile2 ggml-cuda gmake[1]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12' /usr/bin/cmake -S/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 -B/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12 --check-build-system CMakeFiles/Makefile.cmake 0 /usr/bin/cmake -E cmake_progress_start /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/CMakeFiles 47 /usr/bin/gmake -f CMakeFiles/Makefile2 ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/all gmake[2]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12' /usr/bin/gmake -f ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/build.make ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/depend gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12' cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12 && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/DependInfo.cmake "--color=" gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12' /usr/bin/gmake -f ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/build.make ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/build gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12' [ 2%] Building C object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.c.o [ 4%] Building C object 
ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-alloc.c.o [ 4%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.cpp.o [ 4%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-backend.cpp.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/gcc -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.c.o -MF CMakeFiles/ggml-base.dir/ggml.c.o.d -o CMakeFiles/ggml-base.dir/ggml.c.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. 
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.cpp.o -MF CMakeFiles/ggml-base.dir/ggml.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.cpp cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/gcc -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-alloc.c.o -MF CMakeFiles/ggml-base.dir/ggml-alloc.c.o.d -o CMakeFiles/ggml-base.dir/ggml-alloc.c.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-alloc.c cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. 
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-backend.cpp.o -MF CMakeFiles/ggml-base.dir/ggml-backend.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml-backend.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-backend.cpp In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-alloc.c:4: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘ggml_hash_insert’ defined but not used [-Wunused-function] 261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘ggml_hash_contains’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘ggml_bitset_size’ defined but not used [-Wunused-function] 187 | static size_t ggml_bitset_size(size_t n) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘ggml_set_op_params_f32’ defined but not used [-Wunused-function] 150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘ggml_set_op_params_i32’ defined but not used [-Wunused-function] 145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘ggml_get_op_params_f32’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘ggml_get_op_params_i32’ defined but not used [-Wunused-function] 135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘ggml_set_op_params’ defined but not used [-Wunused-function] 129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c:5663:13: warning: ‘ggml_hash_map_free’ defined but not 
used [-Wunused-function] 5663 | static void ggml_hash_map_free(struct hash_map * map) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c:5656:26: warning: ‘ggml_new_hash_map’ defined but not used [-Wunused-function] 5656 | static struct hash_map * ggml_new_hash_map(size_t size) { | ^~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c:5: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘ggml_hash_find_or_insert’ defined but not used [-Wunused-function] 282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘ggml_hash_contains’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘ggml_get_op_params_f32’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘ggml_are_same_layout’ defined but not used [-Wunused-function] 77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) { | ^~~~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.cpp:1: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘size_t ggml_hash_find_or_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘size_t ggml_hash_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘bool ggml_hash_contains(const ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘size_t ggml_bitset_size(size_t)’ defined but not used [-Wunused-function] 187 | static size_t ggml_bitset_size(size_t n) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘void ggml_set_op_params_f32(ggml_tensor*, uint32_t, float)’ defined but not used [-Wunused-function] 150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) { | ^~~~~~~~~~~~~~~~~~~~~~ 
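Annotation: the -Wunused-function diagnostics above (and their repeats for every translation unit below) all follow one pattern. ggml-impl.h defines small helpers with internal linkage (plain static, not static inline), so every .c/.cpp file that includes the header receives its own private copy, and any file that never calls a given helper trips the -Wall unused-function check. They are noise rather than errors and do not stop the build. A minimal reproduction sketch, using hypothetical file names impl.h and user.cu that merely mirror the ggml-impl.h pattern (not files from the ollama tree):

    /* impl.h (hypothetical): a helper with internal linkage, as in ggml-impl.h */
    #pragma once
    #include <stddef.h>

    /* plain `static`: each includer gets its own copy; unused copies warn */
    static size_t bitset_size(size_t n) { return (n + 31) / 32; }

    /* user.cu (hypothetical): includes the header but never calls the helper */
    #include "impl.h"
    int main(void) { return 0; }

    /* nvcc -Xcompiler -Wall -c user.cu   (or: gcc -Wall -c user.c)
     *   -> warning: 'bitset_size' defined but not used [-Wunused-function]
     * Declaring the helper `static inline`, or tagging it
     * __attribute__((unused)), silences the warning. */

This is also why each warning cites the same ggml-impl.h line numbers yet is re-reported once per object file being compiled.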
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘void ggml_set_op_params_i32(ggml_tensor*, uint32_t, int32_t)’ defined but not used [-Wunused-function] 145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘float ggml_get_op_params_f32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘int32_t ggml_get_op_params_i32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘void ggml_set_op_params(ggml_tensor*, const void*, size_t)’ defined but not used [-Wunused-function] 129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘bool ggml_are_same_layout(const ggml_tensor*, const ggml_tensor*)’ defined but not used [-Wunused-function] 77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) { | ^~~~~~~~~~~~~~~~~~~~ [ 6%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-opt.cpp.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. 
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-opt.cpp.o -MF CMakeFiles/ggml-base.dir/ggml-opt.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml-opt.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-opt.cpp In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-backend.cpp:14: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘bool ggml_hash_contains(const ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘size_t ggml_bitset_size(size_t)’ defined but not used [-Wunused-function] 187 | static size_t ggml_bitset_size(size_t n) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘void ggml_set_op_params_f32(ggml_tensor*, uint32_t, float)’ defined but not used [-Wunused-function] 150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘void ggml_set_op_params_i32(ggml_tensor*, uint32_t, int32_t)’ defined but not used [-Wunused-function] 145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘float ggml_get_op_params_f32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘int32_t ggml_get_op_params_i32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘void ggml_set_op_params(ggml_tensor*, const void*, size_t)’ defined but not used [-Wunused-function] 129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) { | ^~~~~~~~~~~~~~~~~~ [ 6%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-threading.cpp.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src 
&& /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-threading.cpp.o -MF CMakeFiles/ggml-base.dir/ggml-threading.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml-threading.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-threading.cpp [ 6%] Building C object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-quants.c.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/gcc -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. 
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-quants.c.o -MF CMakeFiles/ggml-base.dir/ggml-quants.c.o.d -o CMakeFiles/ggml-base.dir/ggml-quants.c.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-opt.cpp:6: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘size_t ggml_hash_find_or_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘size_t ggml_hash_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘bool ggml_hash_contains(const ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘size_t ggml_bitset_size(size_t)’ defined but not used [-Wunused-function] 187 | static size_t ggml_bitset_size(size_t n) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘void ggml_set_op_params_f32(ggml_tensor*, uint32_t, float)’ defined but not used [-Wunused-function] 150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘void ggml_set_op_params_i32(ggml_tensor*, uint32_t, int32_t)’ defined but not used [-Wunused-function] 145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘float ggml_get_op_params_f32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘int32_t ggml_get_op_params_i32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 135 | 
static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘void ggml_set_op_params(ggml_tensor*, const void*, size_t)’ defined but not used [-Wunused-function] 129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘bool ggml_are_same_layout(const ggml_tensor*, const ggml_tensor*)’ defined but not used [-Wunused-function] 77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c:4067:12: warning: ‘iq1_find_best_neighbour’ defined but not used [-Wunused-function] 4067 | static int iq1_find_best_neighbour(const uint16_t * GGML_RESTRICT neighbours, const uint64_t * GGML_RESTRICT grid, | ^~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c:579:14: warning: ‘make_qkx1_quants’ defined but not used [-Wunused-function] 579 | static float make_qkx1_quants(int n, int nmax, const float * GGML_RESTRICT x, uint8_t * GGML_RESTRICT L, float * GGML_RESTRICT the_min, | ^~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c:5: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘ggml_hash_find_or_insert’ defined but not used [-Wunused-function] 282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘ggml_hash_insert’ defined but not used [-Wunused-function] 261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘ggml_hash_contains’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘ggml_bitset_size’ defined but not used [-Wunused-function] 187 | static size_t ggml_bitset_size(size_t n) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘ggml_set_op_params_f32’ defined but not used [-Wunused-function] 150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘ggml_set_op_params_i32’ defined but not used [-Wunused-function] 145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: 
‘ggml_get_op_params_f32’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘ggml_get_op_params_i32’ defined but not used [-Wunused-function] 135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘ggml_set_op_params’ defined but not used [-Wunused-function] 129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘ggml_are_same_layout’ defined but not used [-Wunused-function] 77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) { | ^~~~~~~~~~~~~~~~~~~~ [ 8%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/gguf.cpp.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. 
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/gguf.cpp.o -MF CMakeFiles/ggml-base.dir/gguf.cpp.o.d -o CMakeFiles/ggml-base.dir/gguf.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/gguf.cpp In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/gguf.cpp:3: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘size_t ggml_hash_find_or_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘size_t ggml_hash_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘bool ggml_hash_contains(const ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘size_t ggml_bitset_size(size_t)’ defined but not used [-Wunused-function] 187 | static size_t ggml_bitset_size(size_t n) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘void ggml_set_op_params_f32(ggml_tensor*, uint32_t, float)’ defined but not used [-Wunused-function] 150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘void ggml_set_op_params_i32(ggml_tensor*, uint32_t, int32_t)’ defined but not used [-Wunused-function] 145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘float ggml_get_op_params_f32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘int32_t ggml_get_op_params_i32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 135 | static int32_t 
ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) {
| ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘void ggml_set_op_params(ggml_tensor*, const void*, size_t)’ defined but not used [-Wunused-function]
129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) {
| ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘bool ggml_are_same_layout(const ggml_tensor*, const ggml_tensor*)’ defined but not used [-Wunused-function]
77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) {
| ^~~~~~~~~~~~~~~~~~~~
[ 8%] Linking CXX shared library ../../../../../lib/ollama/libggml-base.so
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/cmake -E cmake_link_script CMakeFiles/ggml-base.dir/link.txt --verbose=1
/usr/bin/g++ -fPIC -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -Wl,--dependency-file=CMakeFiles/ggml-base.dir/link.d -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-hardened-ld-errors -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -shared -Wl,-soname,libggml-base.so -o ../../../../../lib/ollama/libggml-base.so "CMakeFiles/ggml-base.dir/ggml.c.o" "CMakeFiles/ggml-base.dir/ggml.cpp.o" "CMakeFiles/ggml-base.dir/ggml-alloc.c.o" "CMakeFiles/ggml-base.dir/ggml-backend.cpp.o" "CMakeFiles/ggml-base.dir/ggml-opt.cpp.o" "CMakeFiles/ggml-base.dir/ggml-threading.cpp.o" "CMakeFiles/ggml-base.dir/ggml-quants.c.o" "CMakeFiles/ggml-base.dir/gguf.cpp.o" -lm
gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12'
[ 8%] Built target ggml-base
/usr/bin/gmake -f ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/build.make ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/depend
gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12'
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12 && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/DependInfo.cmake "--color="
gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12'
/usr/bin/gmake -f ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/build.make ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/build
gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12'
[ 8%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/acc.cu.o
[ 10%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/arange.cu.o
[ 10%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argmax.cu.o
[ 10%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/add-id.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/acc.cu.o -MF CMakeFiles/ggml-cuda.dir/acc.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/acc.cu -o CMakeFiles/ggml-cuda.dir/acc.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/add-id.cu.o -MF CMakeFiles/ggml-cuda.dir/add-id.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/add-id.cu -o CMakeFiles/ggml-cuda.dir/add-id.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/arange.cu.o -MF CMakeFiles/ggml-cuda.dir/arange.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/arange.cu -o CMakeFiles/ggml-cuda.dir/arange.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argmax.cu.o -MF CMakeFiles/ggml-cuda.dir/argmax.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/argmax.cu -o CMakeFiles/ggml-cuda.dir/argmax.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 12%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argsort.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argsort.cu.o -MF CMakeFiles/ggml-cuda.dir/argsort.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/argsort.cu -o CMakeFiles/ggml-cuda.dir/argsort.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 12%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/binbcast.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/binbcast.cu.o -MF CMakeFiles/ggml-cuda.dir/binbcast.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/binbcast.cu -o CMakeFiles/ggml-cuda.dir/binbcast.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 14%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/clamp.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/clamp.cu.o -MF CMakeFiles/ggml-cuda.dir/clamp.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/clamp.cu -o CMakeFiles/ggml-cuda.dir/clamp.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 14%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/concat.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/concat.cu.o -MF CMakeFiles/ggml-cuda.dir/concat.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/concat.cu -o CMakeFiles/ggml-cuda.dir/concat.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 14%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o -MF CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/conv-transpose-1d.cu -o CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 17%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o -MF CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/conv2d-dw.cu -o CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 17%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o -MF CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/conv2d-transpose.cu -o CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 19%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/convert.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/convert.cu.o -MF CMakeFiles/ggml-cuda.dir/convert.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/convert.cu -o CMakeFiles/ggml-cuda.dir/convert.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 19%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/count-equal.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/count-equal.cu.o -MF CMakeFiles/ggml-cuda.dir/count-equal.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/count-equal.cu -o CMakeFiles/ggml-cuda.dir/count-equal.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 21%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cpy.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cpy.cu.o -MF CMakeFiles/ggml-cuda.dir/cpy.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/cpy.cu -o CMakeFiles/ggml-cuda.dir/cpy.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 21%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o -MF CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/cross-entropy-loss.cu -o CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 23%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/diagmask.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/diagmask.cu.o -MF CMakeFiles/ggml-cuda.dir/diagmask.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/diagmask.cu -o CMakeFiles/ggml-cuda.dir/diagmask.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 23%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn-tile-f16.cu -o CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 23%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn-tile-f32.cu -o CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 25%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn-wmma-f16.cu -o CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 25%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn.cu -o CMakeFiles/ggml-cuda.dir/fattn.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 27%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/getrows.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/getrows.cu.o -MF CMakeFiles/ggml-cuda.dir/getrows.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/getrows.cu -o CMakeFiles/ggml-cuda.dir/getrows.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 27%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o -MF CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/ggml-cuda.cu -o CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 29%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/gla.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/gla.cu.o -MF CMakeFiles/ggml-cuda.dir/gla.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/gla.cu -o CMakeFiles/ggml-cuda.dir/gla.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 29%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/im2col.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/im2col.cu.o -MF CMakeFiles/ggml-cuda.dir/im2col.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/im2col.cu -o CMakeFiles/ggml-cuda.dir/im2col.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 29%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mean.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mean.cu.o -MF CMakeFiles/ggml-cuda.dir/mean.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mean.cu -o CMakeFiles/ggml-cuda.dir/mean.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 31%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmf.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmf.cu.o -MF CMakeFiles/ggml-cuda.dir/mmf.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmf.cu -o CMakeFiles/ggml-cuda.dir/mmf.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 31%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmq.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmq.cu.o -MF CMakeFiles/ggml-cuda.dir/mmq.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmq.cu -o CMakeFiles/ggml-cuda.dir/mmq.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 34%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvf.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvf.cu.o -MF CMakeFiles/ggml-cuda.dir/mmvf.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmvf.cu -o CMakeFiles/ggml-cuda.dir/mmvf.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 34%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvq.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvq.cu.o -MF CMakeFiles/ggml-cuda.dir/mmvq.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmvq.cu -o CMakeFiles/ggml-cuda.dir/mmvq.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 36%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/norm.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/norm.cu.o -MF CMakeFiles/ggml-cuda.dir/norm.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/norm.cu -o CMakeFiles/ggml-cuda.dir/norm.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 36%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o -MF CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/opt-step-adamw.cu -o CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 38%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/out-prod.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/out-prod.cu.o -MF CMakeFiles/ggml-cuda.dir/out-prod.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/out-prod.cu -o CMakeFiles/ggml-cuda.dir/out-prod.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 38%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pad.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pad.cu.o -MF CMakeFiles/ggml-cuda.dir/pad.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/pad.cu -o CMakeFiles/ggml-cuda.dir/pad.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 38%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pool2d.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pool2d.cu.o -MF CMakeFiles/ggml-cuda.dir/pool2d.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/pool2d.cu -o CMakeFiles/ggml-cuda.dir/pool2d.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 40%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/quantize.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/quantize.cu.o -MF CMakeFiles/ggml-cuda.dir/quantize.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/quantize.cu -o CMakeFiles/ggml-cuda.dir/quantize.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 40%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/roll.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/roll.cu.o -MF CMakeFiles/ggml-cuda.dir/roll.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/roll.cu -o CMakeFiles/ggml-cuda.dir/roll.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 42%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/rope.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/rope.cu.o -MF CMakeFiles/ggml-cuda.dir/rope.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/rope.cu -o CMakeFiles/ggml-cuda.dir/rope.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 42%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/scale.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/scale.cu.o -MF CMakeFiles/ggml-cuda.dir/scale.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/scale.cu -o CMakeFiles/ggml-cuda.dir/scale.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 44%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/set-rows.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/set-rows.cu.o -MF CMakeFiles/ggml-cuda.dir/set-rows.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/set-rows.cu -o CMakeFiles/ggml-cuda.dir/set-rows.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 44%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softcap.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softcap.cu.o -MF CMakeFiles/ggml-cuda.dir/softcap.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/softcap.cu -o CMakeFiles/ggml-cuda.dir/softcap.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 46%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softmax.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softmax.cu.o -MF CMakeFiles/ggml-cuda.dir/softmax.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/softmax.cu -o CMakeFiles/ggml-cuda.dir/softmax.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 46%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o -MF CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/ssm-conv.cu -o CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 46%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o -MF CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/ssm-scan.cu -o CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 48%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sum.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sum.cu.o -MF CMakeFiles/ggml-cuda.dir/sum.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/sum.cu -o CMakeFiles/ggml-cuda.dir/sum.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 48%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sumrows.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sumrows.cu.o -MF CMakeFiles/ggml-cuda.dir/sumrows.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/sumrows.cu -o CMakeFiles/ggml-cuda.dir/sumrows.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 51%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/tsembd.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/tsembd.cu.o -MF CMakeFiles/ggml-cuda.dir/tsembd.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/tsembd.cu -o CMakeFiles/ggml-cuda.dir/tsembd.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 51%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/unary.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/unary.cu.o -MF CMakeFiles/ggml-cuda.dir/unary.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/unary.cu -o CMakeFiles/ggml-cuda.dir/unary.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 53%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/upscale.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/upscale.cu.o -MF CMakeFiles/ggml-cuda.dir/upscale.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/upscale.cu -o CMakeFiles/ggml-cuda.dir/upscale.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 53%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/wkv.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/wkv.cu.o -MF CMakeFiles/ggml-cuda.dir/wkv.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/wkv.cu -o CMakeFiles/ggml-cuda.dir/wkv.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
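The remaining CUDA objects in this target are the pre-generated FlashAttention template instances: each template-instances/fattn-mma-f16-instance-ncols1_<N1>-ncols2_<N2>.cu file carries the explicit instantiation of the f16 MMA FlashAttention kernel for one (ncols1, ncols2) tiling, and splitting the specializations into separate translation units lets them compile in parallel instead of inside one oversized file. A quick way to count how many such instances the tree ships, run from the ollama-0.12.3 source root (an illustrative command, not part of this build):

    ls ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-*.cu | wc -l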
[ 53%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 55%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 55%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 57%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 57%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 59%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 59%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 61%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 61%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 61%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 63%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 63%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 65%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 65%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 68%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 68%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 68%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 70%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 70%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
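
The fattn-mma-f16-instance-ncols1_*-ncols2_*.cu objects above are variants of one FlashAttention MMA kernel, instantiated per (ncols1, ncols2) tile shape. Baking the shape in as compile-time constants lets the compiler fully unroll the tile loops and fix register usage per variant, which is presumably why each shape gets its own translation unit. A hedged sketch of the pattern (all identifiers are illustrative, not ggml's real ones; compile with nvcc -c):

    // fattn_tile_sketch.cu -- hypothetical sketch of one-instantiation-per-shape.
    #include <cuda_runtime.h>

    template <int ncols1, int ncols2>
    __global__ void fattn_tile_sketch(float *out) {
        // Accumulator sized by the template parameters: a true compile-time
        // constant, so the loop below unrolls completely for each variant.
        float acc[ncols1 * ncols2] = {};
    #pragma unroll
        for (int i = 0; i < ncols1 * ncols2; ++i)
            acc[i] += 1.0f;  // stand-in for the real MMA accumulation
        out[threadIdx.x] = acc[0];
    }

    // One explicit instantiation per shape, mirroring file names like
    // ncols1_4-ncols2_16 and ncols1_8-ncols2_8 above:
    template __global__ void fattn_tile_sketch<4, 16>(float *);
    template __global__ void fattn_tile_sketch<8, 8>(float *);
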
[ 72%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq1_s.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 72%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_s.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 74%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_xs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 74%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_xxs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 76%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq3_s.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 76%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq3_xxs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 76%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq4_nl.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 78%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq4_xs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 78%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-mxfp4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 80%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q2_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 80%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q3_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 82%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q4_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 82%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q4_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 82%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q4_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 85%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q5_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 85%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q5_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 87%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q5_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 87%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q6_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 89%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q8_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
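
The mmq-instance-*.cu objects above (iq1_s through q8_0) cover the matrix-multiply-quantized kernel, one quantization format per file. Splitting the instantiations across translation units keeps each nvcc job small and lets the parallel make jobs compile them concurrently, which matters at these per-file compile times. A minimal sketch of the pattern under that assumption (placeholder identifiers; block_q4_0 here is not ggml's real block layout; compile with nvcc -c):

    // mmq_instance_sketch.cu -- hypothetical one-quant-type-per-file sketch.
    #include <cuda_runtime.h>

    struct block_q4_0 { unsigned char qs[16]; unsigned short d; };  // placeholder block

    // Shared kernel template; in the real tree this would live in a common header:
    template <typename BlockQ>
    __global__ void mul_mat_q_sketch(const BlockQ *w, const float *x, float *y, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) y[i] = x[i];  // stand-in for the real dequantize + dot product
    }

    // The entire body of a hypothetical mmq-instance-q4_0.cu: force exactly one
    // instantiation, so the expensive expansions run as independent compiler jobs.
    template __global__ void mul_mat_q_sketch<block_q4_0>(const block_q4_0 *, const float *, float *, int);
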
[ 89%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 91%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 91%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 91%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 93%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 93%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 95%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 95%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 97%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 97%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
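Editor's note: each template-instance compile above ends with the same nvcc deprecation warning because the --generate-code list still includes pre-Turing targets (compute_50 through compute_70). A minimal shell sketch of the two usual remedies, assuming nvcc is on PATH and reusing one of the source files above (the output name test.o is illustrative, and a real compile would also need the include flags from includes_CUDA.rsp):

    # Remedy 1: keep the old targets and silence the warning, as nvcc itself suggests
    nvcc -Wno-deprecated-gpu-targets \
        "--generate-code=arch=compute_50,code=[compute_50,sm_50]" \
        -x cu -c fattn-vec-f16-instance-hs128-q4_0-q4_0.cu -o test.o
    # Remedy 2: target only sm_75 and newer, so the warning never fires
    nvcc "--generate-code=arch=compute_75,code=[compute_75,sm_75]" \
        -x cu -c fattn-vec-f16-instance-hs128-q4_0-q4_0.cu -o test.o

In a CMake-driven build like this one, the architecture list would normally be changed through CMAKE_CUDA_ARCHITECTURES rather than by editing the generated command lines.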
[100%] Linking CUDA shared module ../../../../../../lib/ollama/libggml-cuda.so
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /usr/bin/cmake -E cmake_link_script CMakeFiles/ggml-cuda.dir/link.txt --verbose=1
/usr/bin/g++-14 -fPIC -Wl,--dependency-file=CMakeFiles/ggml-cuda.dir/link.d -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-hardened-ld-errors -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -shared -o ../../../../../../lib/ollama/libggml-cuda.so @CMakeFiles/ggml-cuda.dir/objects1.rsp @CMakeFiles/ggml-cuda.dir/linkLibs.rsp -L"/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/targets/x86_64-linux/lib/stubs" -L"/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/targets/x86_64-linux/lib"
gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12'
[100%] Built target ggml-cuda
gmake[2]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12'
/usr/bin/cmake -E cmake_progress_start /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/CMakeFiles 0
gmake[1]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12'
+ RPM_EC=0
++ jobs -p
+ exit 0
Executing(%install): /bin/sh -e /var/tmp/rpm-tmp.frFtDr
+ umask 022
+ cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ '[' /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT '!=' / ']'
+ rm -rf /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT
++ dirname /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT
+ mkdir -p /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ mkdir /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT
+ CFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer '
+ export CFLAGS
+ CXXFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer '
+ export CXXFLAGS
+ FFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules '
+ export FFLAGS
+ FCFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules '
+ export FCFLAGS
+ VALAFLAGS=-g
+ export VALAFLAGS
+ RUSTFLAGS='-Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Cforce-frame-pointers=yes -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn'
+ export RUSTFLAGS
+ LDFLAGS='-Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-hardened-ld-errors -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes '
+ export LDFLAGS
+ LT_SYS_LIBRARY_PATH=/usr/lib64:
+ export LT_SYS_LIBRARY_PATH
+ CC=gcc
+ export CC
+ CXX=g++
+ export CXX
+ cd ollama-0.12.3
+ DESTDIR=/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT
+ /usr/bin/cmake --install redhat-linux-build_cuda-13 --component CUDA
-- Install configuration: "Release"
-- Installing: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/lib64/ollama/cuda_v13/libggml-cuda.so
-- Set non-toolchain portion of runtime path of "/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/lib64/ollama/cuda_v13/libggml-cuda.so" to ""
+ DESTDIR=/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT
+ /usr/bin/cmake --install redhat-linux-build_cuda-12 --component CUDA
-- Install configuration: "Release"
-- Installing: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/lib64/ollama/cuda_v12/libggml-cuda.so
-- Set non-toolchain portion of runtime path of "/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/lib64/ollama/cuda_v12/libggml-cuda.so" to ""
+ /usr/bin/find-debuginfo -j4 --strict-build-id -m -i --build-id-seed 0.12.3-1.fc43 --unique-debug-suffix -0.12.3-1.fc43.x86_64 --unique-debug-src-base ollama-ggml-cuda-0.12.3-1.fc43.x86_64 --run-dwz --dwz-low-mem-die-limit 10000000 --dwz-max-die-limit 110000000 -S debugsourcefiles.list /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3
find-debuginfo: starting
Extracting debug info from 2 files
DWARF-compressing 2 files
sepdebugcrcfix: Updated 2 CRC32s, 0 CRC32s did match.
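Editor's note: the %install step above stages two separately configured build trees into a single BUILDROOT. Each cmake --install run is restricted to the CUDA install component, and DESTDIR prefixes every destination path, which is how the cuda_v13 and cuda_v12 copies of libggml-cuda.so end up side by side. A condensed sketch of the pattern, assuming the same build directories as in the trace:

    # DESTDIR redirects the install root; --component limits the run to one install component
    DESTDIR="$PWD/BUILDROOT" cmake --install redhat-linux-build_cuda-13 --component CUDA
    DESTDIR="$PWD/BUILDROOT" cmake --install redhat-linux-build_cuda-12 --component CUDA

The empty runtime path reported for each installed .so is the expected outcome here: check-rpaths later verifies that the staged libraries carry no build-tree RPATH.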
Creating .debug symlinks for symlinks to ELF files
Copying sources found by 'debugedit -l' to /usr/src/debug/ollama-ggml-cuda-0.12.3-1.fc43.x86_64
find-debuginfo: done
+ /usr/lib/rpm/check-buildroot
+ /usr/lib/rpm/redhat/brp-ldconfig
+ /usr/lib/rpm/brp-compress
+ /usr/lib/rpm/redhat/brp-strip-lto /usr/bin/strip
+ /usr/lib/rpm/check-rpaths
+ /usr/lib/rpm/redhat/brp-mangle-shebangs
+ /usr/lib/rpm/brp-remove-la-files
+ /usr/lib/rpm/redhat/brp-python-rpm-in-distinfo
+ env /usr/lib/rpm/redhat/brp-python-bytecompile '' 1 0 -j4
+ /usr/lib/rpm/redhat/brp-python-hardlink
+ /usr/bin/add-determinism --brp -j4 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT
Scanned 39 directories and 162 files, processed 0 inodes, 0 modified (0 replaced + 0 rewritten), 0 unsupported format, 0 errors
Reading /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/SPECPARTS/rpm-debuginfo.specpart
Processing files: ollama-ggml-cuda-13-0.12.3-1.fc43.x86_64
Executing(%license): /bin/sh -e /var/tmp/rpm-tmp.pE1mjN
+ umask 022
+ cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ cd ollama-0.12.3
+ LICENSEDIR=/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/share/licenses/ollama-ggml-cuda-13
+ export LC_ALL=C.UTF-8
+ LC_ALL=C.UTF-8
+ export LICENSEDIR
+ /usr/bin/mkdir -p /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/share/licenses/ollama-ggml-cuda-13
+ cp -pr /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/LICENSE /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/share/licenses/ollama-ggml-cuda-13
+ RPM_EC=0
++ jobs -p
+ exit 0
Provides: libggml-cuda.so()(64bit) ollama-ggml-cuda-13 = 0.12.3-1.fc43 ollama-ggml-cuda-13(x86-64) = 0.12.3-1.fc43
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Requires: libc.so.6()(64bit) libc.so.6(GLIBC_2.14)(64bit) libc.so.6(GLIBC_2.2.5)(64bit) libc.so.6(GLIBC_ABI_DT_RELR)(64bit) libcublas.so.13()(64bit) libcublas.so.13(libcublas.so.13)(64bit) libcuda.so.1()(64bit) libcudart.so.13()(64bit) libcudart.so.13(libcudart.so.13)(64bit) libgcc_s.so.1()(64bit) libgcc_s.so.1(GCC_3.0)(64bit) libm.so.6()(64bit) libm.so.6(GLIBC_2.27)(64bit) libstdc++.so.6()(64bit) libstdc++.so.6(CXXABI_1.3)(64bit) libstdc++.so.6(CXXABI_1.3.9)(64bit) libstdc++.so.6(GLIBCXX_3.4)(64bit) libstdc++.so.6(GLIBCXX_3.4.11)(64bit) libstdc++.so.6(GLIBCXX_3.4.21)(64bit) libstdc++.so.6(GLIBCXX_3.4.30)(64bit) libstdc++.so.6(GLIBCXX_3.4.32)(64bit) rtld(GNU_HASH)
Supplements: if libcublas-13-0 ollama-ggml
Processing files: ollama-ggml-cuda-12-0.12.3-1.fc43.x86_64
Executing(%license): /bin/sh -e /var/tmp/rpm-tmp.fgRqVh
+ umask 022
+ cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ cd ollama-0.12.3
+ LICENSEDIR=/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/share/licenses/ollama-ggml-cuda-12
+ export LC_ALL=C.UTF-8
+ LC_ALL=C.UTF-8
+ export LICENSEDIR
+ /usr/bin/mkdir -p /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/share/licenses/ollama-ggml-cuda-12
+ cp -pr /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/LICENSE /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/share/licenses/ollama-ggml-cuda-12
+ RPM_EC=0
++ jobs -p
+ exit 0
Provides: libggml-cuda.so()(64bit) ollama-ggml-cuda-12 = 0.12.3-1.fc43 ollama-ggml-cuda-12(x86-64) = 0.12.3-1.fc43
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Requires: libc.so.6()(64bit) libc.so.6(GLIBC_2.14)(64bit) libc.so.6(GLIBC_2.2.5)(64bit) libc.so.6(GLIBC_ABI_DT_RELR)(64bit) libcublas.so.12()(64bit) libcublas.so.12(libcublas.so.12)(64bit) libcuda.so.1()(64bit) libcudart.so.12()(64bit) libcudart.so.12(libcudart.so.12)(64bit) libgcc_s.so.1()(64bit) libgcc_s.so.1(GCC_3.0)(64bit) libm.so.6()(64bit) libm.so.6(GLIBC_2.27)(64bit) libstdc++.so.6()(64bit) libstdc++.so.6(CXXABI_1.3)(64bit) libstdc++.so.6(CXXABI_1.3.9)(64bit) libstdc++.so.6(GLIBCXX_3.4)(64bit) libstdc++.so.6(GLIBCXX_3.4.11)(64bit) libstdc++.so.6(GLIBCXX_3.4.21)(64bit) libstdc++.so.6(GLIBCXX_3.4.30)(64bit) libstdc++.so.6(GLIBCXX_3.4.32)(64bit) rtld(GNU_HASH)
Supplements: if libcublas-12-9 ollama-ggml
Processing files: ollama-ggml-cuda-debugsource-0.12.3-1.fc43.x86_64
Provides: ollama-ggml-cuda-debugsource = 0.12.3-1.fc43 ollama-ggml-cuda-debugsource(x86-64) = 0.12.3-1.fc43
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Processing files: ollama-ggml-cuda-debuginfo-0.12.3-1.fc43.x86_64
Provides: ollama-ggml-cuda-debuginfo = 0.12.3-1.fc43 ollama-ggml-cuda-debuginfo(x86-64) = 0.12.3-1.fc43
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Recommends: ollama-ggml-cuda-debugsource(x86-64) = 0.12.3-1.fc43
Processing files: ollama-ggml-cuda-13-debuginfo-0.12.3-1.fc43.x86_64
Provides: debuginfo(build-id) = 756cf1705443b059226e82334d07056834b9e1cd libggml-cuda.so-0.12.3-1.fc43.x86_64.debug()(64bit) ollama-ggml-cuda-13-debuginfo = 0.12.3-1.fc43 ollama-ggml-cuda-13-debuginfo(x86-64) = 0.12.3-1.fc43
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Recommends: ollama-ggml-cuda-debugsource(x86-64) = 0.12.3-1.fc43
Processing files: ollama-ggml-cuda-12-debuginfo-0.12.3-1.fc43.x86_64
Provides: debuginfo(build-id) = b39287180b9e5b05eb972c11586ef26cef286bf8 libggml-cuda.so-0.12.3-1.fc43.x86_64.debug()(64bit) ollama-ggml-cuda-12-debuginfo = 0.12.3-1.fc43 ollama-ggml-cuda-12-debuginfo(x86-64) = 0.12.3-1.fc43
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Recommends: ollama-ggml-cuda-debugsource(x86-64) = 0.12.3-1.fc43
Checking for unpackaged file(s): /usr/lib/rpm/check-files /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT
Wrote: /builddir/build/RPMS/ollama-ggml-cuda-12-debuginfo-0.12.3-1.fc43.x86_64.rpm
Wrote: /builddir/build/RPMS/ollama-ggml-cuda-debugsource-0.12.3-1.fc43.x86_64.rpm
Wrote: /builddir/build/RPMS/ollama-ggml-cuda-13-debuginfo-0.12.3-1.fc43.x86_64.rpm
Wrote: /builddir/build/RPMS/ollama-ggml-cuda-debuginfo-0.12.3-1.fc43.x86_64.rpm
Wrote: /builddir/build/RPMS/ollama-ggml-cuda-13-0.12.3-1.fc43.x86_64.rpm
Wrote: /builddir/build/RPMS/ollama-ggml-cuda-12-0.12.3-1.fc43.x86_64.rpm
Executing(rmbuild): /bin/sh -e /var/tmp/rpm-tmp.4sJVuw
+ umask 022
+ cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ test -d /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ /usr/bin/chmod -Rf a+rX,u+w,g-w,o-w /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ rm -rf /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ RPM_EC=0
++ jobs -p
+ exit 0
Finish: rpmbuild ollama-ggml-cuda-0.12.3-1.fc43.src.rpm
Finish: build phase for ollama-ggml-cuda-0.12.3-1.fc43.src.rpm
INFO: chroot_scan: 1 files copied to /var/lib/copr-rpmbuild/results/chroot_scan
INFO: /var/lib/mock/fedora-43-x86_64-1759434727.591343/root/var/log/dnf5.log
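Editor's note: the Provides:/Requires: blocks above are generated automatically by rpmbuild's ELF dependency generator from the SONAMEs and versioned symbols of each staged libggml-cuda.so, which is why the cuda_v13 subpackage picks up libcublas.so.13/libcudart.so.13 while the cuda_v12 one picks up their .12 counterparts. A quick way to double-check the metadata of a written package, assuming the paths from the Wrote: lines above:

    # List the computed capabilities of one subpackage
    rpm -qp --provides /builddir/build/RPMS/ollama-ggml-cuda-12-0.12.3-1.fc43.x86_64.rpm
    rpm -qp --requires /builddir/build/RPMS/ollama-ggml-cuda-12-0.12.3-1.fc43.x86_64.rpm

Note that libcuda.so.1 is deliberately left unresolved at build time: it is supplied by the NVIDIA driver at runtime, which is why the link step earlier pointed the linker at the CUDA stubs directory.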
INFO: chroot_scan: creating tarball /var/lib/copr-rpmbuild/results/chroot_scan.tar.gz
/bin/tar: Removing leading `/' from member names
INFO: Done(/var/lib/copr-rpmbuild/results/ollama-ggml-cuda-0.12.3-1.fc43.src.rpm) Config(child) 106 minutes 47 seconds
INFO: Results and/or logs in: /var/lib/copr-rpmbuild/results
INFO: Cleaning up build root ('cleanup_on_success=True')
Start: clean chroot
INFO: unmounting tmpfs.
Finish: clean chroot
Finish: run
Running RPMResults tool
Package info:
{
    "packages": [
        {
            "name": "ollama-ggml-cuda-12",
            "epoch": null,
            "version": "0.12.3",
            "release": "1.fc43",
            "arch": "x86_64"
        },
        {
            "name": "ollama-ggml-cuda-13-debuginfo",
            "epoch": null,
            "version": "0.12.3",
            "release": "1.fc43",
            "arch": "x86_64"
        },
        {
            "name": "ollama-ggml-cuda",
            "epoch": null,
            "version": "0.12.3",
            "release": "1.fc43",
            "arch": "src"
        },
        {
            "name": "ollama-ggml-cuda-debuginfo",
            "epoch": null,
            "version": "0.12.3",
            "release": "1.fc43",
            "arch": "x86_64"
        },
        {
            "name": "ollama-ggml-cuda-12-debuginfo",
            "epoch": null,
            "version": "0.12.3",
            "release": "1.fc43",
            "arch": "x86_64"
        },
        {
            "name": "ollama-ggml-cuda-13",
            "epoch": null,
            "version": "0.12.3",
            "release": "1.fc43",
            "arch": "x86_64"
        },
        {
            "name": "ollama-ggml-cuda-debugsource",
            "epoch": null,
            "version": "0.12.3",
            "release": "1.fc43",
            "arch": "x86_64"
        }
    ]
}
RPMResults finished
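Editor's note: the Package info block is machine-readable JSON emitted by the RPMResults tool, summarizing the six binary rpms plus the rebuilt src.rpm. A hypothetical one-liner for consuming it, assuming the JSON has been saved to a file named results.json:

    # Print each built package as name.arch
    jq -r '.packages[] | "\(.name).\(.arch)"' results.json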