How to Analyze Communication Cost?

Open ShokofehVS opened this issue 3 years ago • 5 comments

Describe the feature

I would like to ask how we can analyze the communication cost, or more precisely the size in bytes of the ciphertexts (single scalar and vector) that are sent from cloud service providers to the data owner in an interactive workflow.

Sample code:

import numpy as np
from Pyfhel import PyCtxt

# Excerpt: a method of the user's class
def _calculate_msr(self, data, rows, cols, HE, t_enc, t_dec):
    # Select the sub-matrix and encrypt it element by element
    sub_data = np.ascontiguousarray(data[rows][:, cols])
    n_rows, n_cols = sub_data.shape
    flat_sub_data = sub_data.flatten()
    arr_sub_data = np.empty(len(flat_sub_data), dtype=PyCtxt)
    for i in range(len(flat_sub_data)):
        arr_sub_data[i] = HE.encryptFrac(flat_sub_data[i])
    arr_sub_data = arr_sub_data.reshape(sub_data.shape)

    # Encoded plaintext factors that turn sums into means:
    # 1 / (number of elements) for overall means, 1 / (number of columns) for row means
    inv_size = HE.encodeFrac(1 / (n_rows * n_cols))
    inv_cols = HE.encodeFrac(1 / n_cols)

    # Encrypted data_mean (single scalar)
    enc_data_mean = np.sum(arr_sub_data) * inv_size

    # Encrypted row_means (column vector, so it broadcasts over columns)
    enc_row_means = np.sum(arr_sub_data, axis=1) * inv_cols
    enc_row_means = enc_row_means.reshape((n_rows, 1))

    # Encrypted col_means
    enc_col_means = np.mean(arr_sub_data, axis=0)

    # Encrypted residues
    enc_residues = arr_sub_data - enc_row_means - enc_col_means + enc_data_mean

    # Encrypted squared residues
    enc_squared_residues = enc_residues ** 2

    # Encrypted msr (single scalar)
    enc_msr = np.sum(enc_squared_residues) * inv_size

    # Encrypted row_msr (vector)
    enc_row_msr = np.sum(enc_squared_residues, axis=1) * inv_cols

    # Encrypted col_msr (vector)
    enc_col_msr = np.mean(enc_squared_residues, axis=0)

    # Decrypting msr
    decrypted_msr = HE.decryptFrac(enc_msr)

    # Decrypting msr_row (the results are plain floats, not ciphertexts)
    decrypted_msr_row = np.empty(len(enc_row_msr), dtype=float)
    for i in range(len(enc_row_msr)):
        decrypted_msr_row[i] = HE.decryptFrac(enc_row_msr[i])

    # Decrypting msr_col
    decrypted_msr_col = np.empty(len(enc_col_msr), dtype=float)
    for i in range(len(enc_col_msr)):
        decrypted_msr_col[i] = HE.decryptFrac(enc_col_msr[i])

    return decrypted_msr, decrypted_msr_row, decrypted_msr_col

Input data (https://arep.med.harvard.edu/biclustering/yeast.matrix): a matrix with 2884 rows and 17 columns.

ShokofehVS avatar Aug 01 '22 12:08 ShokofehVS

The best way to analyze communication costs would probably be to use SEAL's serialization feature, either by actually writing the ciphertext to a file or by using something like save_size (see https://github.com/microsoft/SEAL/blob/main/native/src/seal/ciphertext.h#L458-L466).

iirc, Pyfhel has serialization support, but I assume we don't expose the save_size function.

AlexanderViand avatar Aug 05 '22 09:08 AlexanderViand

If needed, save_size could be easily exposed for individual PyCtxt objects.

ibarrond avatar Aug 09 '22 18:08 ibarrond

AttributeError: 'Pyfhel.PyCtxt.PyCtxt' object has no attribute 'save_size' (Pyfhel 2.3.1)

ShokofehVS avatar Aug 14 '22 15:08 ShokofehVS

I said it could! It is not implemented yet, but I can do it if you deem it necessary.

ibarrond avatar Aug 14 '22 16:08 ibarrond

OK, understood. Yes, that would be good, but I assume it would be a new feature for the current stable version, which I am not working with at the moment. One approach I take for now is to write the ciphertext objects individually to a file and then measure the total size of the file.
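The file-based approach described above can be sketched roughly as follows. This is a minimal illustration, not Pyfhel-specific code: it assumes each ciphertext has already been serialized to a bytes object by whatever means the installed Pyfhel version offers, and the helper name `total_size_on_disk` and the stand-in payloads are hypothetical.

```python
import os
import tempfile


def total_size_on_disk(blobs, directory):
    """Write each serialized blob to its own file and return the total size in bytes."""
    total = 0
    for i, blob in enumerate(blobs):
        path = os.path.join(directory, f"ctxt_{i}.bin")
        with open(path, "wb") as f:
            f.write(blob)
        # Ask the file system for the size instead of trusting len(blob)
        total += os.path.getsize(path)
    return total


# Stand-in payloads (assumption): in practice each blob is one serialized ciphertext.
blobs = [bytes(1000), bytes(2500)]
with tempfile.TemporaryDirectory() as tmp:
    total_bytes = total_size_on_disk(blobs, tmp)
print(total_bytes)
```

Writing one file per ciphertext keeps the per-object sizes separable, which helps when scalars and vectors need to be reported separately.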

ShokofehVS avatar Aug 14 '22 17:08 ShokofehVS

As of commit 5e0a456 (v3.3.0), all the save and load functions return the size in bytes of the serialized/loaded object. You can use this to get the serialized object sizes directly and thus measure the communication cost.
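A rough sketch of how those return values might be collected is shown below. It assumes only what this thread states: that `save` takes a file name and a compression mode (as in the `ciphertext.save("file_name.txt", "zlib")` call quoted in this issue) and returns the byte size. The helper `saved_sizes` and the `FakeCtxt` class are hypothetical; `FakeCtxt` is a stand-in so the example runs without Pyfhel installed.

```python
import os
import tempfile


def saved_sizes(ciphertexts, directory, compr_mode="zlib"):
    """Save every ciphertext and collect the byte size that each save() call returns."""
    sizes = []
    for i, ctxt in enumerate(ciphertexts):
        path = os.path.join(directory, f"ctxt_{i}.bin")
        sizes.append(ctxt.save(path, compr_mode))
    return sizes


class FakeCtxt:
    """Hypothetical stand-in for a ciphertext; NOT part of Pyfhel."""

    def __init__(self, payload):
        self.payload = payload

    def save(self, file_name, compr_mode):
        # Mimics the described behaviour: write the object, return its size in bytes.
        # The compression mode is ignored by this stand-in.
        with open(file_name, "wb") as f:
            return f.write(self.payload)


with tempfile.TemporaryDirectory() as tmp:
    sizes = saved_sizes([FakeCtxt(bytes(10)), FakeCtxt(bytes(20))], tmp)
print(sizes, sum(sizes))
```

With real ciphertexts, the list passed to `saved_sizes` would hold the objects actually sent in one round of the protocol.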

Closing as completed!

ibarrond avatar Oct 04 '22 21:10 ibarrond

Hi @ibarrond

I came back again to analyze the communication size :)

In my implementation, there are a number of iterations ranging from 330 (by setting up different parameters for better performance). I measured the size of each communicated ciphertext with ciphertext.save("file_name.txt", "zlib"), so I now have an array containing these sizes for a specific ciphertext.

To report this cost, I am not sure which statistic would be best to take from this array (max, mean or average). Could you help me in this regard?

Thanks Alberto!

ShokofehVS avatar Jan 23 '24 07:01 ShokofehVS

I think the average per round and total communication are the only metrics that matter. And I agree on the method to calculate them via the bytesize of the ciphertexts compressed with zlib.
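As a small illustration, the two metrics could be computed from the recorded sizes along these lines. The function name `communication_summary` and the sample numbers are hypothetical; the input is assumed to be one list of ciphertext byte sizes per round.

```python
def communication_summary(sizes_per_round):
    """Return total bytes, number of rounds, and average bytes per round."""
    if not sizes_per_round:
        raise ValueError("need at least one round")
    total = sum(sum(round_sizes) for round_sizes in sizes_per_round)
    rounds = len(sizes_per_round)
    return {
        "total_bytes": total,
        "rounds": rounds,
        "avg_bytes_per_round": total / rounds,
    }


# Made-up sizes for three rounds, purely for illustration.
summary = communication_summary([[100, 200], [300], [150, 150]])
print(summary)
```

Keeping the per-round lists rather than a single flat array makes it possible to report both numbers from the same measurements.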

ibarrond avatar Jan 25 '24 18:01 ibarrond