[libcu++][cuda::ptx] Add prefetch PTX wrappers #9620

Open
gonidelis wants to merge 1 commit into NVIDIA:main from gonidelis:add_ptx_prefetch

@gonidelis

Member

Second try to fix #9616

I took Fede's advice and used the internal libcucxx_ptx tool, then added a test. Since both Bernhard and Fede are out, I'll ask @ahendriksen to point out anything I got wrong.

@gonidelis gonidelis requested a review from a team as a code owner June 28, 2026 17:42
@gonidelis gonidelis requested a review from ericniebler June 28, 2026 17:42
@github-project-automation github-project-automation Bot moved this to Todo in CCCL Jun 28, 2026
@cccl-authenticator-app cccl-authenticator-app Bot moved this from Todo to In Review in CCCL Jun 28, 2026
@coderabbitai

coderabbitai Bot commented Jun 28, 2026

Contributor

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Added CUDA PTX prefetch instruction support to the public cuda::ptx header set.
    • New prefetch-related declarations are now available through the main PTX include.
  • Tests

    • Added a compile-time test covering the new prefetch PTX wrapper inclusion.

Walkthrough

Adds cuda/__ptx/instructions/prefetch.h, a new PTX instruction wrapper header with include guard, compiler system-header pragmas, and namespace wrappers delegating to a generated prefetch.h. Wires it into the cuda::ptx public header and adds a minimal compilation-pass test.
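
The layout described in this walkthrough can be sketched as a single self-contained file. This is only an illustration under assumptions: the guard name DEMO_PTX_INSTRUCTIONS_PREFETCH_H, the DEMO_BEGIN_NAMESPACE_PTX / DEMO_END_NAMESPACE_PTX macros, the demo::ptx namespace, and prefetch_placeholder() are all hypothetical. The header in this PR relies on the library's own configuration macros and takes its declarations from the generated prefetch.h instead of defining anything inline.

```cpp
// Hypothetical stand-in for a PTX wrapper header; every name here is illustrative.
#ifndef DEMO_PTX_INSTRUCTIONS_PREFETCH_H
#define DEMO_PTX_INSTRUCTIONS_PREFETCH_H

// A real wrapper would mark itself as a system header at this point
// (for example with `#pragma GCC system_header`). That pragma is only
// meaningful in an included file, so this single-file sketch leaves it out.

// Namespace prologue and epilogue, written as macros so that every wrapper
// header opens and closes the same namespace in the same way.
#define DEMO_BEGIN_NAMESPACE_PTX namespace demo { namespace ptx {
#define DEMO_END_NAMESPACE_PTX }}

DEMO_BEGIN_NAMESPACE_PTX
// A real wrapper would #include its generated declarations here, so that they
// land inside the namespace. A placeholder keeps the sketch compilable.
constexpr int prefetch_placeholder() noexcept { return 0; }
DEMO_END_NAMESPACE_PTX

#endif // DEMO_PTX_INSTRUCTIONS_PREFETCH_H
```

Because the wrapper itself contains no declarations, reviewing it in isolation says little about the prefetch() overloads; those live entirely in the generated file.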

prefetch PTX wrapper

| Layer / File(s) | Summary |
| --- | --- |
| prefetch header and public namespace wiring (libcudacxx/include/cuda/__ptx/instructions/prefetch.h, libcudacxx/include/cuda/ptx) | New prefetch.h with include guard, compiler system-header pragmas, and namespace prologue/epilogue including the generated definitions; added to cuda/ptx at line 97. |
| Compilation-pass test (libcudacxx/test/libcudacxx/cuda/ptx/ptx.prefetch.compile.pass.cpp) | New test including <cuda/ptx> and the generated prefetch header with a threads-unsupported annotation and a main returning 0. |

Assessment against linked issues

| Objective | Addressed | Explanation |
| --- | --- | --- |
| Introduce cuda::ptx::prefetch() assembly wrapper [#9616] | | This PR adds the prefetch.h wrapper header and wiring, but the generated cuda/__ptx/instructions/generated/prefetch.h (containing the actual prefetch() declarations) is not included in this diff, so it is unclear whether the assembly wrapper itself is present and correct. |

Out-of-scope changes

No out-of-scope changes found.


@coderabbitai coderabbitai Bot left a comment

Contributor


🧹 Nitpick comments (1)
libcudacxx/include/cuda/__ptx/instructions/prefetch.h (1)

12-13: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

suggestion: Rename the include guard to the path-derived form used in libcudacxx headers. _CUDA_PTX_PREFETCH_H_ is too generic for cuda/__ptx/instructions/prefetch.h and risks collisions with future headers. As per coding guidelines, "Headers must use include guards derived from the uppercase full path."

Also applies to: 40-40

Source: Coding guidelines
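
One possible reading of "derived from the uppercase full path" is sketched below. The helper guard_from_path is hypothetical and not part of libcudacxx, and the project's actual convention may differ, for example in leading or trailing underscores. The sketch only shows why a path-derived name is less likely to collide than a short hand-written one.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: maps a header path to an include-guard name by
// upper-casing letters and digits and turning every non-alphanumeric
// character (slashes, dots, underscores) into '_'.
inline std::string guard_from_path(const std::string& path)
{
  std::string guard;
  guard.reserve(path.size());
  for (unsigned char c : path)
  {
    guard.push_back(std::isalnum(c) ? static_cast<char>(std::toupper(c)) : '_');
  }
  return guard;
}

// Self-check: two headers with the same file name but different directories
// get different guard names.
inline void guard_from_path_selfcheck()
{
  assert(guard_from_path("a/prefetch.h") != guard_from_path("b/prefetch.h"));
}
```

Under this assumed rule, every directory component of the path appears in the guard, so a second prefetch.h added elsewhere in the tree would not reuse the same macro.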


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 3f919381-8399-4ebf-9476-f350f32fafdc

📥 Commits

Reviewing files that changed from the base of the PR and between 7e585c9 and 01ce259.

⛔ Files ignored due to path filters (2)
  • libcudacxx/include/cuda/__ptx/instructions/generated/prefetch.h is excluded by !**/generated/**
  • libcudacxx/test/libcudacxx/cuda/ptx/generated/prefetch.h is excluded by !**/generated/**
📒 Files selected for processing (3)
  • libcudacxx/include/cuda/__ptx/instructions/prefetch.h
  • libcudacxx/include/cuda/ptx
  • libcudacxx/test/libcudacxx/cuda/ptx/ptx.prefetch.compile.pass.cpp

@github-actions

Contributor

😬 CI Workflow Results

🟥 Finished in 1h 11m: Pass: 99%/120 | Total: 2d 04h | Max: 1h 03m | Hits: 78%/484549

See results here.

Labels

None yet

Projects

Status: In Review

Development

Successfully merging this pull request may close these issues.

Introduce cuda::ptx::prefetch() assembly wrapper in libcu++

1 participant