feat(gpu): add support for GPU-accelerated expand on the HL Api #2276

pdroalves · 2025-04-16T14:47:57Z

closes: https://github.com/zama-ai/tfhe-rs-internal/issues/986

PR content/description

Check-list:

Tests for the changes have been added (for bug fixes / features)
Docs have been added / updated (for bug fixes / features)
Relevant issues are marked as resolved/closed, related issues are linked in the description
Check for breaking changes (including serialization changes) and add them to commit message following the conventional commit specification

This change is

agnesLeroy

Hey @pdroalves! Amazing to get expand on GPU in the HL API! 🎉 I have some questions about the PR, and also I'm thinking it would be good to have some documentation similar to tfhe/docs/fhe-computation/advanced-features/zk-pok.md showing how to run expand on GPU, in a new file maybe tfhe/docs/configuration/gpu_acceleration/zk-pok.md, where we would explain the ZK proof and verif can only be executed on CPU, linking to the existing zk-pok.md file, and adding a part about expand on GPU, wdyt?

Reviewed 8 of 15 files at r1, all commit messages.
Reviewable status: 8 of 15 files reviewed, 8 unresolved discussions (waiting on @IceTDrinker)

tfhe/src/integer/gpu/key_switching_key/mod.rs line 11 at r1 (raw file):

#[derive(Clone)]
#[allow(dead_code, clippy::struct_field_names)]

Do we really need allow(dead_code)?

tfhe/src/integer/gpu/key_switching_key/mod.rs line 17 at r1 (raw file):

}

#[allow(dead_code, clippy::struct_field_names)]

Do we really need allow(dead_code)?

tfhe/src/high_level_api/compact_list.rs line 4 at r1 (raw file):

use crate::backward_compatibility::compact_list::CompactCiphertextListVersions;
#[cfg(feature = "zk-pok")]

Is it normal this import is removed?

tfhe/src/high_level_api/compact_list.rs line 28 at r1 (raw file):

use crate::{CompactPublicKey, Tag};

#[cfg(feature = "strings")]

Same question here

tfhe/src/high_level_api/compact_list.rs line 235 at r1 (raw file):

#[cfg(feature = "zk-pok")]
pub mod zk {

Why does the module become pub?

tfhe/src/high_level_api/compact_list.rs line 252 at r1 (raw file):

    }

    #[derive(Serialize, Deserialize, Versionize)]

Here it doesn't derive Clone anymore?

tfhe/src/high_level_api/compact_list.rs line 255 at r1 (raw file):

    #[versionize(ProvenCompactCiphertextListVersions)]
    pub struct ProvenCompactCiphertextList {
        pub(in crate::high_level_api) inner: InnerProvenCompactCiphertextList,

Why this pub(in crate::high_level_api)?

tfhe/src/high_level_api/keys/inner.rs line 315 at r1 (raw file):

#[cfg(feature = "gpu")]
#[allow(dead_code)]

Do we really need to allow dead_code? Isn't there something missing in the prelude instead? 🤔

IceTDrinker · 2025-04-18T08:24:20Z

note that clippy::struct_field_names is not necessary anymore, the lint was a potential problem for backward compatibility (as the lint could randomly start complaining and a developer could change the name, breaking the compatibility on some structs

IceTDrinker

Haven't read everything but it seems to me there is something that's not right in the way part of the code is currently designed, an expander is not expected to be serialized/versioned, it's the object that comes after an expansion and allows to retrieve data from it, the compact list it is derived from can be serialized, the resulting FheUint etc. from the expander as well, but the expander is only there to make the conversion

Reviewed 8 of 15 files at r1, all commit messages.
Reviewable status: 10 of 15 files reviewed, 12 unresolved discussions (waiting on @agnesLeroy and @pdroalves)

tfhe/src/high_level_api/compact_list.rs line 4 at r1 (raw file):

Previously, agnesLeroy (Agnès Leroy) wrote…

Is it normal this import is removed?

No, that does not look reasonable

tfhe/src/high_level_api/compact_list.rs line 28 at r1 (raw file):

Previously, agnesLeroy (Agnès Leroy) wrote…

Same question here

this one was moved a few lines above it seems

tfhe/src/high_level_api/compact_list.rs line 185 at r1 (raw file):

    }

    pub fn expand(&self) -> crate::Result<CompactCiphertextListExpander> {

The way the function is written means expansion is always done on CPU in that case ?

tfhe/src/integer/backward_compatibility/ciphertext/mod.rs line 120 at r1 (raw file):

#[derive(VersionsDispatch)]
pub enum CompactCiphertextListExpanderVersions {

I'm confused, the expander is never serialized right ?

tfhe/src/integer/ciphertext/compact_list.rs line 309 at r1 (raw file):

#[derive(Versionize)]
#[versionize(CompactCiphertextListExpanderVersions)]

the expander is not supposed to be serialized right ? so then why does it need versioning ? 🤔

tfhe/src/core_crypto/gpu/entities/lwe_keyswitch_key.rs line 13 at r1 (raw file):

};

#[allow(dead_code)]

as Agnès is mentioning dead code stuff, given this one is public, is it necessary ?

agnesLeroy · 2025-04-18T09:16:44Z

For the clippy::struct_field_names remark you'll need to rebase on main

pdroalves

Regarding struct_field_names, I just rebased and it is gone.

@agnesLeroy Yes, I think it's a good idea. The code to be shown is probably not needed, but the explanation about verify on CPU and expand on GPU is important. I will write that.

@IceTDrinker Yeah, actually the expander was not being serialized, those were only artifacts from a confusing moment during development. I just removed it.

Reviewable status: 10 of 15 files reviewed, 12 unresolved discussions (waiting on @agnesLeroy and @IceTDrinker)

tfhe/src/core_crypto/gpu/entities/lwe_keyswitch_key.rs line 13 at r1 (raw file):

Previously, IceTDrinker (Arthur Meyre) wrote…

as Agnès is mentioning dead code stuff, given this one is public, is it necessary ?

I've added this to fix a warning about ciphertext_modulus:

warning: field `ciphertext_modulus` is never read
  --> tfhe/src/core_crypto/gpu/entities/lwe_keyswitch_key.rs:20:5
   |
14 | pub struct CudaLweKeyswitchKey<T: UnsignedInteger> {
   |            ------------------- field in this struct
...
20 |     ciphertext_modulus: CiphertextModulus<T>,
   |     ^^^^^^^^^^^^^^^^^^
   |
   = note: `CudaLweKeyswitchKey` has a derived impl for the trait `Clone`, but this is intentionally ignored during dead code analysis
   = note: `#[warn(dead_code)]` on by default

Do you have any suggestions for handling this differently?

tfhe/src/high_level_api/compact_list.rs line 4 at r1 (raw file):

Previously, IceTDrinker (Arthur Meyre) wrote…

No, that does not look reasonable

They were not removed, only moved a few lines below. ProvenCompactCiphertextListVersions in particular is not inside mod zk.

tfhe/src/high_level_api/compact_list.rs line 28 at r1 (raw file):

Previously, IceTDrinker (Arthur Meyre) wrote…

this one was moved a few lines above it seems

Yes, it was also moved.

tfhe/src/high_level_api/compact_list.rs line 185 at r1 (raw file):

Previously, IceTDrinker (Arthur Meyre) wrote…

The way the function is written means expansion is always done on CPU in that case ?

Yes, but it doesn't need to. The integer API doesn't have support to expand CompactCiphertextList, only ProvenCompactCiphertextList and so the HLAPI. I guess I could just generalize that and have the same method for the former. Wdyt?

tfhe/src/high_level_api/compact_list.rs line 235 at r1 (raw file):

Previously, agnesLeroy (Agnès Leroy) wrote…

Why does the module become pub?

Shouldn't. I probably modified it temporarily for something during development and forgot to remove. Removed!

tfhe/src/high_level_api/compact_list.rs line 252 at r1 (raw file):

Previously, agnesLeroy (Agnès Leroy) wrote…

Here it doesn't derive Clone anymore?

I had to remove it when I refactored ProvenCompactCiphertextList and introduced InnerProvenCompactCiphertextList. I just added again and implemented Clone for InnerProvenCompactCiphertextList.

tfhe/src/high_level_api/compact_list.rs line 255 at r1 (raw file):

Previously, agnesLeroy (Agnès Leroy) wrote…

Why this pub(in crate::high_level_api)?

I probably copied this from somewhere else. Not needed. I changed it back to pub(crate).

tfhe/src/high_level_api/keys/inner.rs line 315 at r1 (raw file):

Previously, agnesLeroy (Agnès Leroy) wrote…

Do we really need to allow dead_code? Isn't there something missing in the prelude instead? 🤔

If I remove this Cargo will complain about cpk_key_switching_key_material not being used. I guess it's because it is only used when the gpu + zk-pok features are activated.

error: field `cpk_key_switching_key_material` is never read
   --> tfhe/src/high_level_api/keys/inner.rs:317:16
    |
315 | pub struct IntegerCudaServerKey {
    |            -------------------- field in this struct
316 |     pub(crate) key: crate::integer::gpu::CudaServerKey,
317 |     pub(crate) cpk_key_switching_key_material:
    |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

tfhe/src/integer/backward_compatibility/ciphertext/mod.rs line 120 at r1 (raw file):

Previously, IceTDrinker (Arthur Meyre) wrote…

I'm confused, the expander is never serialized right ?

Correct. At some point I was mixing the expander with the proven compact list inside my mind and making a confusion. I removed the versioning for somewhere else in the code but forgot to remove this one.

tfhe/src/integer/ciphertext/compact_list.rs line 309 at r1 (raw file):

Previously, IceTDrinker (Arthur Meyre) wrote…

the expander is not supposed to be serialized right ? so then why does it need versioning ? 🤔

Never. Removed.

tfhe/src/integer/gpu/key_switching_key/mod.rs line 11 at r1 (raw file):

Previously, agnesLeroy (Agnès Leroy) wrote…

Do we really need allow(dead_code)?

I guess so. Without it I get

error: fields `key_switching_key` and `destination_key` are never read
  --> tfhe/src/integer/gpu/key_switching_key/mod.rs:13:16
   |
12 | pub struct CudaKeySwitchingKeyMaterial {
   |            --------------------------- fields in this struct
13 |     pub(crate) key_switching_key: CudaLweKeyswitchKey<u64>,
   |                ^^^^^^^^^^^^^^^^^
14 |     pub(crate) destination_key: EncryptionKeyChoice,
   |                ^^^^^^^^^^^^^^^
   |
   = note: `CudaKeySwitchingKeyMaterial` has a derived impl for the trait `Clone`, but this is intentionally ignored during dead code analysis
   = note: `-D dead-code` implied by `-D warnings`
   = help: to override `-D warnings` add `#[allow(dead_code)]`

error: fields `key_switching_key_material` and `dest_server_key` are never read
  --> tfhe/src/integer/gpu/key_switching_key/mod.rs:19:16
   |
18 | pub struct CudaKeySwitchingKey<'keys> {
   |            ------------------- fields in this struct
19 |     pub(crate) key_switching_key_material: CudaKeySwitchingKeyMaterial,
   |                ^^^^^^^^^^^^^^^^^^^^^^^^^^
20 |     pub(crate) dest_server_key: &'keys CudaServerKey,

in pcc_gpu. Again, I suppose this happens because CudaKeySwitchingKey is only used in the HLAPI behind gpu + zkpok. Is there any other way to handle this?

tfhe/src/integer/gpu/key_switching_key/mod.rs line 17 at r1 (raw file):

Previously, agnesLeroy (Agnès Leroy) wrote…

Do we really need allow(dead_code)?

Answered above.

pdroalves

@agnesLeroy I just pushed a first text version. Wdyt?

Reviewable status: 10 of 15 files reviewed, 12 unresolved discussions (waiting on @agnesLeroy and @IceTDrinker)

agnesLeroy

Reviewable status: 4 of 16 files reviewed, 15 unresolved discussions (waiting on @IceTDrinker and @pdroalves)

tfhe/src/core_crypto/gpu/entities/lwe_keyswitch_key.rs line 13 at r1 (raw file):

Previously, pdroalves (Pedro Alves) wrote…

I've added this to fix a warning about ciphertext_modulus:

warning: field `ciphertext_modulus` is never read
  --> tfhe/src/core_crypto/gpu/entities/lwe_keyswitch_key.rs:20:5
   |
14 | pub struct CudaLweKeyswitchKey<T: UnsignedInteger> {
   |            ------------------- field in this struct
...
20 |     ciphertext_modulus: CiphertextModulus<T>,
   |     ^^^^^^^^^^^^^^^^^^
   |
   = note: `CudaLweKeyswitchKey` has a derived impl for the trait `Clone`, but this is intentionally ignored during dead code analysis
   = note: `#[warn(dead_code)]` on by default

Do you have any suggestions for handling this differently?

Maybe something is missing in the prelude?

tfhe/src/high_level_api/keys/inner.rs line 315 at r1 (raw file):

Previously, pdroalves (Pedro Alves) wrote…

If I remove this Cargo will complain about cpk_key_switching_key_material not being used. I guess it's because it is only used when the gpu + zk-pok features are activated.
error: field `cpk_key_switching_key_material` is never read
   --> tfhe/src/high_level_api/keys/inner.rs:317:16
    |
315 | pub struct IntegerCudaServerKey {
    |            -------------------- field in this struct
316 |     pub(crate) key: crate::integer::gpu::CudaServerKey,
317 |     pub(crate) cpk_key_switching_key_material:
    |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

@IceTDrinker how should we deal with this, should we gate the cpk_key_switching_key_material with the zk-pok feature?

tfhe/docs/configuration/gpu_acceleration/zk-pok.md line 3 at r2 (raw file):

# Zero-knowledge proofs

This document explains how to speed up the verification of zero-knowledge proofs using the GPU, similar to how it's done on the [CPU](../../fhe-computation/advanced-features/zk-pok.md).

This is a bit confusing, as it leads the reader to think verification can be executed on the CPU. How about this instead?

Zero-knowledge (ZK) proof and verification can only be executed on CPU. However, the preprocessing needed to go from ciphertexts formatted for ZK to ciphertexts in the right format for computation purposes can be accelerated on GPU. ZK verification can be executed on the CPU concurrently with this preprocessing, also referred to as expansion, making the overall workflow faster.

tfhe/docs/configuration/gpu_acceleration/zk-pok.md line 14 at r3 (raw file):

{% endhint %}

Moreover, the GPU backend better performs when Multi-bit PBS parameters are used, as

I would remove this sentence, as it is not specific to this page.

tfhe/docs/configuration/gpu_acceleration/zk-pok.md line 19 at r3 (raw file):

## Example

The following example shows how a client can encrypt and prove a ciphertext, and how a server can verify and compute the ciphertext on the GPU:

"how a server can verify and compute the ciphertext on the GPU:" -> "how a server can verify the proof, preprocess the ciphertext and run a computation on it on GPU:"

pdroalves

Reviewable status: 4 of 16 files reviewed, 15 unresolved discussions (waiting on @agnesLeroy and @IceTDrinker)

tfhe/docs/configuration/gpu_acceleration/zk-pok.md line 3 at r2 (raw file):

Previously, agnesLeroy (Agnès Leroy) wrote…

This is a bit confusing, as it leads the reader to think verification can be executed on the CPU. How about this instead?

Zero-knowledge (ZK) proof and verification can only be executed on CPU. However, the preprocessing needed to go from ciphertexts formatted for ZK to ciphertexts in the right format for computation purposes can be accelerated on GPU. ZK verification can be executed on the CPU concurrently with this preprocessing, also referred to as expansion, making the overall workflow faster.

Indeed!

I rewrote this paragraph using your text as reference. Also, I added a new subsection about proven compact lists.

tfhe/docs/configuration/gpu_acceleration/zk-pok.md line 14 at r3 (raw file):

Previously, agnesLeroy (Agnès Leroy) wrote…

I would remove this sentence, as it is not specific to this page.

Done.

tfhe/docs/configuration/gpu_acceleration/zk-pok.md line 19 at r3 (raw file):

Previously, agnesLeroy (Agnès Leroy) wrote…

"how a server can verify and compute the ciphertext on the GPU:" -> "how a server can verify the proof, preprocess the ciphertext and run a computation on it on GPU:"

Done.

tfhe/src/core_crypto/gpu/entities/lwe_keyswitch_key.rs line 13 at r1 (raw file):

Previously, agnesLeroy (Agnès Leroy) wrote…

Maybe something is missing in the prelude?

Don't think so. What I think is happening is that we indeed don't use that attribute. I guess I could just remove it from CudaLweKeyswitchKey. Wdyt?

pdroalves

Reviewable status: 4 of 16 files reviewed, 15 unresolved discussions (waiting on @agnesLeroy and @IceTDrinker)

tfhe/src/high_level_api/compact_list.rs line 185 at r1 (raw file):

Previously, pdroalves (Pedro Alves) wrote…

Yes, but it doesn't need to. The integer API doesn't have support to expand CompactCiphertextList, only ProvenCompactCiphertextList and so the HLAPI. I guess I could just generalize that and have the same method for the former. Wdyt?

I gave this topic some thoughts and I believe I need to do some refactoring. We have CudaCompactCiphertextList in the integer API but it doesn't have an expand function. This does not make sense. I will rewrite CudaProvenCompactCiphertextList to have a CudaCompactCiphertextList as attribute and call theirs expand when expand_without_verify() is called.

But will do that tomorrow.

IceTDrinker

Reviewed 4 of 15 files at r1, 7 of 8 files at r2, 1 of 1 files at r4, all commit messages.
Reviewable status: all files reviewed, 33 unresolved discussions (waiting on @agnesLeroy and @pdroalves)

tfhe/src/core_crypto/gpu/entities/lwe_keyswitch_key.rs line 13 at r1 (raw file):

Previously, pdroalves (Pedro Alves) wrote…

Don't think so. What I think is happening is that we indeed don't use that attribute. I guess I could just remove it from CudaLweKeyswitchKey. Wdyt?

I would not remove it as it indicates how to interpret the data if required, dead code is fine by me

tfhe/src/high_level_api/compact_list.rs line 4 at r1 (raw file):

Previously, pdroalves (Pedro Alves) wrote…

They were not removed, only moved a few lines below. ProvenCompactCiphertextListVersions in particular is not inside mod zk.

ah right much better

tfhe/src/high_level_api/compact_list.rs line 28 at r1 (raw file):

Previously, pdroalves (Pedro Alves) wrote…

Yes, it was also moved.

though in general please don't move imports, it just adds noise to the git log/blame, so please move the strings back to where it was imported

tfhe/src/high_level_api/compact_list.rs line 185 at r1 (raw file):

Previously, pdroalves (Pedro Alves) wrote…

I gave this topic some thoughts and I believe I need to do some refactoring. We have CudaCompactCiphertextList in the integer API but it doesn't have an expand function. This does not make sense. I will rewrite CudaProvenCompactCiphertextList to have a CudaCompactCiphertextList as attribute and call theirs expand when expand_without_verify() is called.

But will do that tomorrow.

ok !

tfhe/src/high_level_api/compact_list.rs line 255 at r1 (raw file):

Previously, pdroalves (Pedro Alves) wrote…

I probably copied this from somewhere else. Not needed. I changed it back to pub(crate).

the pub(in crate::high_level_api) restricts the visibility, it's not necessarily a bad thing

tfhe/src/high_level_api/keys/inner.rs line 315 at r1 (raw file):

Previously, agnesLeroy (Agnès Leroy) wrote…

@IceTDrinker how should we deal with this, should we gate the cpk_key_switching_key_material with the zk-pok feature?

no, having objects changing layout depending on features is very error prone and annoying feature wise, but I would move the allow dead code on the cpk_key_switching_key_material

tfhe/src/integer/gpu/zk/mod.rs line 49 at r4 (raw file):

            .0
            .d_vec
            .gpu_indexes

indices are shared by all the lists in the vec ?

tfhe/docs/configuration/gpu_acceleration/zk-pok.md line 3 at r4 (raw file):

# Zero-knowledge proofs

Zero-knowledge proofs (ZK) are a powerful tool to assert that the encryption of a message is correct, as discussed in [advanced features](../../fhe-computation/advanced-features/zk-pok.md).

there is a trailing whitespace

tfhe/docs/configuration/gpu_acceleration/zk-pok.md line 4 at r4 (raw file):

Zero-knowledge proofs (ZK) are a powerful tool to assert that the encryption of a message is correct, as discussed in [advanced features](../../fhe-computation/advanced-features/zk-pok.md). 
However, computation is not possible on verifiable ciphertexts. This document explains how to use the GPU to accelerate the

this sentence is a bit odd, could maybe just re-use some of the wording of the CPU doc on ZKs, we don't use "verifiable ciphertexts" currently in the docs.

So either use the ProveCompactCiphertextList, name of the type or talk about the Compact Public Key encryption

tfhe/docs/configuration/gpu_acceleration/zk-pok.md line 10 at r4 (raw file):

## Proven compact ciphertext list

A proven compact list of ciphertexts can be seen as a compacted collection of ciphertexts which encryption can be verified.

trailing whitespace ?

tfhe/docs/configuration/gpu_acceleration/zk-pok.md line 11 at r4 (raw file):

A proven compact list of ciphertexts can be seen as a compacted collection of ciphertexts which encryption can be verified. 
This verification is currently only supported on the CPU, but the expansion can be accelerated using the GPU.

trailing whitespace ?

tfhe/docs/configuration/gpu_acceleration/zk-pok.md line 15 at r4 (raw file):

## Supported types
Encrypted messages can be integers (as FheUint64) or booleans. The GPU backend does not currently support encrypted strings.

like FheUint64

tfhe/src/core_crypto/gpu/entities/lwe_keyswitch_key.rs line 14 at r4 (raw file):

#[derive(Clone)]
#[allow(dead_code)]

to avoid noise in the git log keep this one above the [derive(Clone)] or put it directly on the ciphertext_modulus (I think it works)

tfhe/src/high_level_api/compact_list.rs line 415 at r4 (raw file):

                #[cfg(feature = "gpu")]
                InnerProvenCompactCiphertextList::Cuda(inner) => inner
                    .h_proved_lists

I see that you have the host list every time, I would try implementing something to access the CPU list via a

pub(crate) fn inner_cpu_list(&self) -> &Type

this way you can remove all the matches and call the same code logic once on the result of inner_cpu_list

valid for all places where you call inner.h_proved_lists

tfhe/src/high_level_api/compact_list.rs line 483 at r4 (raw file):

                    #[cfg(feature = "gpu")]
                    InnerProvenCompactCiphertextList::Cuda(_) => {
                        Err(crate::Error::new("Expected a Cuda server key".to_string()))

I know some error messages are not always great but this one could be a bit more verbose like

"Tried expanding a ProvenCompactCiphertextList on Cuda device, but the set ServerKey is a CPU ServerKey"

tfhe/src/high_level_api/compact_list.rs line 496 at r4 (raw file):

                                    .as_ref()
                                    .unwrap()
                                    .clone(),

clone ? 🤔

is this copying the whole key material ?

tfhe/src/high_level_api/compact_list.rs line 514 at r4 (raw file):

                    }
                    InnerProvenCompactCiphertextList::Cpu(_) => {
                        Err(crate::Error::new("Expected a CPU server key".to_string()))

in that case you could maybe do the move to device ? not sure what the pattern is in the HL for these things

tfhe/src/high_level_api/compact_list.rs line 554 at r4 (raw file):

                    #[cfg(feature = "gpu")]
                    InnerProvenCompactCiphertextList::Cuda(_) => {
                        Err(crate::Error::new("Expected a Cpu server key".to_string()))

potentially better error message as above

tfhe/src/high_level_api/compact_list.rs line 565 at r4 (raw file):

                                    .key
                                    .cpk_key_switching_key_material
                                    .clone()

does this copy data ?

tfhe/src/high_level_api/compact_list.rs line 578 at r4 (raw file):

                    }
                    InnerProvenCompactCiphertextList::Cpu(_) => {
                        Err(crate::Error::new("Expected a CPU server key".to_string()))

could move to device

tfhe/src/high_level_api/compact_list.rs line 828 at r4 (raw file):

                        tag: self.tag.clone(),
                    })
                    .unwrap();

you should not unwrap here as the function returns a result and the map can fail ?

tfhe/src/high_level_api/compact_list.rs line 842 at r4 (raw file):

                    } else {
                        // Handle other cases or panic if this should never happen
                        Err("Expected CPU variant".to_string())

better error message

tfhe/src/high_level_api/compact_list.rs line 848 at r4 (raw file):

                Ok(ProvenCompactCiphertextList {
                    inner: crate::high_level_api::compact_list::zk::InnerProvenCompactCiphertextList::Cuda(
                        gpu_proven_ct.unwrap(),

don't unwrap in a function returning a result

tfhe/src/high_level_api/keys/server.rs line 299 at r4 (raw file):

            .cpk_key_switching_key_material
            .as_ref()
            .map_or_else(

just map should be enough ? given the else branch is returning a None ?

tfhe/src/integer/gpu/ciphertext/compact_list.rs line 142 at r4 (raw file):

        let lwe_size = lwe_ciphertext_list.lwe_size().0;

        let expanded_blocks = lwe_ciphertext_list

actually there is an iterator if you do

use crate::core_crypto::prelude::ContiguousEntityContainer

then you can do

lwe_ciphertext_list.iter()...map(|(ct, info)| let lwe = LweCiphertext::from_container(ct.as_ref().to_vec(), ct.ciphertext_modulus());

Ciphertext::new(...))

it's not perfect but is less error_prone

For more context:

we are missing the "to_owned" notion on low level Core primitives but for now we haven't had time to think the design through, e.g. we don't know what we would do for some entities if we want to have aligned_vec in the owned version

tfhe/src/high_level_api/tests/tags_on_entities.rs line 170 at r4 (raw file):

    let compressed_server_key = CompressedServerKey::new(&cks);
    let gpu_sks = compressed_server_key.decompress_to_gpu();
    set_server_key(gpu_sks);

could be worth checking the tag on the gpu_sks, as it's checked on the other keys as well, might have been forgotten on the CPU test

pdroalves

Reviewable status: all files reviewed, 23 unresolved discussions (waiting on @agnesLeroy and @IceTDrinker)

tfhe/docs/configuration/gpu_acceleration/zk-pok.md line 4 at r4 (raw file):

Previously, IceTDrinker (Arthur Meyre) wrote…

this sentence is a bit odd, could maybe just re-use some of the wording of the CPU doc on ZKs, we don't use "verifiable ciphertexts" currently in the docs.

So either use the ProveCompactCiphertextList, name of the type or talk about the Compact Public Key encryption

Please check now.

tfhe/src/high_level_api/compact_list.rs line 4 at r1 (raw file):

Previously, IceTDrinker (Arthur Meyre) wrote…

ah right much better

Done.

tfhe/src/high_level_api/compact_list.rs line 28 at r1 (raw file):

Previously, IceTDrinker (Arthur Meyre) wrote…

though in general please don't move imports, it just adds noise to the git log/blame, so please move the strings back to where it was imported

Ok! I've moved them back.

tfhe/src/high_level_api/compact_list.rs line 185 at r1 (raw file):

Previously, IceTDrinker (Arthur Meyre) wrote…

ok !

@IceTDrinker I discussed this with @agnesLeroy and we believe this will be just too big to fit this PR. We prefer to merge it as is and proceed to implement GPU's expand on HL's CompactCiphertextList on a new PR. Are you ok with that?

tfhe/src/high_level_api/compact_list.rs line 255 at r1 (raw file):

Previously, IceTDrinker (Arthur Meyre) wrote…

the pub(in crate::high_level_api) restricts the visibility, it's not necessarily a bad thing

Done.

tfhe/src/high_level_api/compact_list.rs line 415 at r4 (raw file):

Previously, IceTDrinker (Arthur Meyre) wrote…

I see that you have the host list every time, I would try implementing something to access the CPU list via a
pub(crate) fn inner_cpu_list(&self) -> &Type
this way you can remove all the matches and call the same code logic once on the result of inner_cpu_list

valid for all places where you call inner.h_proved_lists

Great idea. Done.

tfhe/src/high_level_api/compact_list.rs line 483 at r4 (raw file):

Previously, IceTDrinker (Arthur Meyre) wrote…

I know some error messages are not always great but this one could be a bit more verbose like

"Tried expanding a ProvenCompactCiphertextList on Cuda device, but the set ServerKey is a CPU ServerKey"

Great suggestion. I will change the other error messages.

tfhe/src/high_level_api/compact_list.rs line 514 at r4 (raw file):

Previously, IceTDrinker (Arthur Meyre) wrote…

in that case you could maybe do the move to device ? not sure what the pattern is in the HL for these things

Not a bad idea. It's not clear to me what would make us get in that case. Probably a wrong use of something, and thus in that case I would like to throw an error so the user know something wrong happened. However, it also feels good to have a method so flexible that can recover from an error like that and still deliver a result.

I don't know what to pick here. @agnesLeroy wdyt?

tfhe/src/high_level_api/compact_list.rs line 828 at r4 (raw file):

Previously, IceTDrinker (Arthur Meyre) wrote…

you should not unwrap here as the function returns a result and the map can fail ?

Good catch.

tfhe/src/high_level_api/compact_list.rs line 842 at r4 (raw file):

Previously, IceTDrinker (Arthur Meyre) wrote…

better error message

Actually this else doesn't need to exist.

tfhe/src/high_level_api/keys/inner.rs line 315 at r1 (raw file):

Previously, IceTDrinker (Arthur Meyre) wrote…

no, having objects changing layout depending on features is very error prone and annoying feature wise, but I would move the allow dead code on the cpk_key_switching_key_material

Done.

tfhe/src/integer/gpu/ciphertext/compact_list.rs line 142 at r4 (raw file):

Previously, IceTDrinker (Arthur Meyre) wrote…

actually there is an iterator if you do

use crate::core_crypto::prelude::ContiguousEntityContainer

then you can do
lwe_ciphertext_list.iter()...map(|(ct, info)| let lwe = LweCiphertext::from_container(ct.as_ref().to_vec(), ct.ciphertext_modulus());

Ciphertext::new(...))
it's not perfect but is less error_prone

For more context:

we are missing the "to_owned" notion on low level Core primitives but for now we haven't had time to think the design through, e.g. we don't know what we would do for some entities if we want to have aligned_vec in the owned version

Cool, that iterator is really helpful.

tfhe/src/integer/gpu/zk/mod.rs line 49 at r4 (raw file):

Previously, IceTDrinker (Arthur Meyre) wrote…

indices are shared by all the lists in the vec ?

Yes. We assume so.

pdroalves

Reviewable status: 10 of 16 files reviewed, 23 unresolved discussions (waiting on @agnesLeroy, @IceTDrinker, and @tmontaigu)

tfhe/src/high_level_api/keys/server.rs line 299 at r4 (raw file):

Previously, IceTDrinker (Arthur Meyre) wrote…

just map should be enough ? given the else branch is returning a None ?

Totally!

- includes documentation about GPU's accelerated expand on the HL API

- Cloning the key is no longer necessary on the HL API

pdroalves

Reviewable status: 8 of 16 files reviewed, 18 unresolved discussions (waiting on @agnesLeroy, @IceTDrinker, and @tmontaigu)

tfhe/src/high_level_api/compact_list.rs line 496 at r4 (raw file):

Previously, IceTDrinker (Arthur Meyre) wrote…

clone ? 🤔

is this copying the whole key material ?

It took a lot of work, but I think I’ve found a solution. Today, @agnesLeroy pointed out that it makes no sense to have a new() method in CudaKeySwitchingKey, since we always convert an existing CPU key when creating a CUDA key. This simplified everything, and I was able to eliminate the cloning. I kept this specific change in a separate commit so you can review it carefully.

cla-bot bot added the cla-signed label Apr 16, 2025

pdroalves force-pushed the pa/feat/expand_hlapi branch 3 times, most recently from 130b2b4 to 8f76233 Compare April 17, 2025 12:49

pdroalves marked this pull request as ready for review April 17, 2025 12:49

pdroalves force-pushed the pa/feat/expand_hlapi branch 6 times, most recently from 86412ec to 3764ea3 Compare April 17, 2025 14:17

pdroalves requested review from IceTDrinker and agnesLeroy April 17, 2025 14:22

pdroalves force-pushed the pa/feat/expand_hlapi branch 5 times, most recently from 5a83bdc to 161093c Compare April 17, 2025 15:45

agnesLeroy requested changes Apr 18, 2025

View reviewed changes

IceTDrinker requested changes Apr 18, 2025

View reviewed changes

pdroalves commented Apr 22, 2025

View reviewed changes

pdroalves force-pushed the pa/feat/expand_hlapi branch from 161093c to 7d827d5 Compare April 22, 2025 12:52

pdroalves commented Apr 22, 2025

View reviewed changes

pdroalves force-pushed the pa/feat/expand_hlapi branch from b1283fb to f82e3c0 Compare April 22, 2025 13:35

pdroalves requested review from IceTDrinker and agnesLeroy April 22, 2025 13:52

agnesLeroy requested changes Apr 23, 2025

View reviewed changes

pdroalves force-pushed the pa/feat/expand_hlapi branch from f82e3c0 to 55edcfd Compare April 23, 2025 18:08

pdroalves commented Apr 23, 2025

View reviewed changes

IceTDrinker requested changes Apr 25, 2025

View reviewed changes

pdroalves force-pushed the pa/feat/expand_hlapi branch from 55edcfd to b2c9d05 Compare April 28, 2025 16:39

pdroalves requested a review from tmontaigu as a code owner April 28, 2025 16:39

pdroalves commented Apr 28, 2025

View reviewed changes

feat(gpu): add support for GPU-accelerated expand on the HL Api

f785425

- includes documentation about GPU's accelerated expand on the HL API

pdroalves force-pushed the pa/feat/expand_hlapi branch from b2c9d05 to f785425 Compare April 28, 2025 17:16

chore(gpu): rework CudaKeySwitchingKey

64a8cec

- Cloning the key is no longer necessary on the HL API

pdroalves commented Apr 30, 2025

View reviewed changes

pdroalves requested review from IceTDrinker and agnesLeroy April 30, 2025 18:13

pdroalves added the approved label May 2, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(gpu): add support for GPU-accelerated expand on the HL Api #2276

feat(gpu): add support for GPU-accelerated expand on the HL Api #2276

pdroalves commented Apr 16, 2025 •

edited by IceTDrinker

Loading

agnesLeroy left a comment

IceTDrinker commented Apr 18, 2025

IceTDrinker left a comment

agnesLeroy commented Apr 18, 2025

pdroalves left a comment

pdroalves left a comment

agnesLeroy left a comment

pdroalves left a comment

pdroalves left a comment

IceTDrinker left a comment

pdroalves left a comment

pdroalves left a comment

pdroalves left a comment

feat(gpu): add support for GPU-accelerated expand on the HL Api #2276

Are you sure you want to change the base?

feat(gpu): add support for GPU-accelerated expand on the HL Api #2276

Conversation

pdroalves commented Apr 16, 2025 • edited by IceTDrinker Loading

PR content/description

Check-list:

agnesLeroy left a comment

Choose a reason for hiding this comment

IceTDrinker commented Apr 18, 2025

IceTDrinker left a comment

Choose a reason for hiding this comment

agnesLeroy commented Apr 18, 2025

pdroalves left a comment

Choose a reason for hiding this comment

pdroalves left a comment

Choose a reason for hiding this comment

agnesLeroy left a comment

Choose a reason for hiding this comment

pdroalves left a comment

Choose a reason for hiding this comment

pdroalves left a comment

Choose a reason for hiding this comment

IceTDrinker left a comment

Choose a reason for hiding this comment

pdroalves left a comment

Choose a reason for hiding this comment

pdroalves left a comment

Choose a reason for hiding this comment

pdroalves left a comment

Choose a reason for hiding this comment

pdroalves commented Apr 16, 2025 •

edited by IceTDrinker

Loading