Week 6: Secure Multiparty Computation (SMC)
🎯 Learning Goals
By the end of this week, you should understand:
- The fundamentals of Secure Multiparty Computation (SMC)
- Key SMC protocols: Yao’s Garbled Circuits, BGW, GMW
- Privacy guarantees and security models in SMC
- Practical applications and limitations of SMC systems
📖 Theoretical Content
Introduction to Secure Multiparty Computation
Definition: Secure Multiparty Computation enables multiple parties to jointly compute a function over their inputs while keeping those inputs private.
Classical Example - Yao’s Millionaire Problem: Two millionaires want to determine who is richer without revealing their exact wealth.
- Alice has wealth A, Bob has wealth B
- Goal: Compute A > B without revealing A or B
- Neither party learns anything beyond the comparison result
Formal Requirements:
- Correctness: Output equals the result of the ideal computation
- Privacy: No party learns more than what can be inferred from their input and output
- Independence of Inputs: Parties cannot choose inputs based on others’ inputs
- Guaranteed Output Delivery: Honest parties receive correct output
- Fairness: Either all parties get output or none do
Security Models
1. Semi-Honest (Honest-but-Curious) Model
- Parties follow the protocol correctly
- But may try to learn extra information from messages
- Easier to achieve, many practical protocols
- Suitable when participants have incentives to cooperate
2. Malicious (Byzantine) Model
- Parties may deviate arbitrarily from protocol
- May provide false inputs or corrupt messages
- Stronger security guarantees
- More complex and expensive protocols
3. Covert Model
- Intermediate between semi-honest and malicious
- Cheating is detected with some probability
- Practical middle ground for many applications
Threshold Security
t-out-of-n Security:
- Protocol secure as long as at most t parties are corrupted
- Common thresholds:
- Honest Majority: t < n/2 (most efficient)
- Dishonest Majority: t < n (requires stronger assumptions)
Circuit Representation
Boolean Circuits: Functions represented as networks of AND, OR, and NOT gates
- Addition: Full adder circuits
- Comparison: Comparator circuits
- Multiplication: More complex, quadratic circuit size
Arithmetic Circuits: Functions over finite fields or rings
- Addition: Simple field addition
- Multiplication: More expensive operation
- Suitable for numerical computations
🔍 Detailed Explanations
Yao’s Garbled Circuits Protocol
High-Level Idea:
- Garbler (Alice) creates “garbled” version of circuit
- Evaluator (Bob) evaluates garbled circuit
- Result revealed without exposing intermediate values
Garbling Process:
- Wire Labels: Each wire gets two random labels (one for 0, one for 1)
- Gate Tables: For each gate, create encrypted truth table
- Encryption: Use wire labels as keys for encryption
Example - AND Gate:
Truth Table: Garbled Table (encrypted):
a | b | out Enc(k_a0, k_b0, k_out0) [for 0 AND 0 = 0]
0 | 0 | 0 Enc(k_a0, k_b1, k_out0) [for 0 AND 1 = 0]
0 | 1 | 0 Enc(k_a1, k_b0, k_out0) [for 1 AND 0 = 0]
1 | 0 | 0 Enc(k_a1, k_b1, k_out1) [for 1 AND 1 = 1]
1 | 1 | 1
Evaluation Process:
- Bob receives garbled circuit from Alice
- Bob obtains wire labels for his input via Oblivious Transfer
- Bob evaluates circuit gate by gate
- Final output wire labels reveal computation result
Secret Sharing
Shamir’s Secret Sharing: Split secret s into n shares such that any t shares can reconstruct s
Polynomial Construction:
- Choose random polynomial f(x) of degree t-1
- Set f(0) = s (the secret)
- Shares are f(1), f(2), …, f(n)
- Any t points determine unique polynomial of degree t-1
BGW Protocol (Ben-Or, Goldwasser, Wigderson):
- Based on secret sharing over finite fields
- Addition: Add corresponding shares
- Multiplication: More complex, requires interaction
- Secure against t < n/3 malicious adversaries
GMW Protocol (Goldreich, Micali, Wigderson)
Share-based approach for Boolean circuits:
- Secret sharing: Use XOR-based sharing (s = s₁ ⊕ s₂ ⊕ … ⊕ sₙ)
- XOR gates: Free computation (XOR shares)
- AND gates: Require secure multiplication protocol
- Communication: Interaction needed for AND gates only
Advantages:
- Simple share structure (XOR-based)
- Efficient for circuits with many XOR operations
- Generalizes to any number of parties
- Secure against t < n/2 semi-honest adversaries
Disadvantages:
- Communication-intensive for AND-heavy circuits
- Round complexity proportional to circuit depth
💡 Practical Examples
Example 1: Private Set Intersection (PSI)
Scenario: Two companies want to find common customers without revealing their entire customer lists.
Setup:
- Company A has customer set S_A = {Alice, Bob, Charlie}
- Company B has customer set S_B = {Bob, Charlie, David}
- Goal: Find S_A ∩ S_B = {Bob, Charlie}
SMC Solution Using Circuit:
def psi_circuit(set_a, set_b):
intersection = []
for item_a in set_a:
for item_b in set_b:
if secure_equal(item_a, item_b): # SMC equality test
intersection.append(item_a)
return intersection
Privacy Guarantees:
- Neither company learns about customers not in intersection
- Only mutual customers are revealed
- No information about set sizes (with proper padding)
Example 2: Private Bidding Auction
Scenario: Sealed-bid auction where highest bidder wins, but losing bids remain private.
Participants:
- Bidder 1: $100
- Bidder 2: $150
- Bidder 3: $120
SMC Protocol:
- Input Phase: Each bidder secret-shares their bid
- Computation Phase: Compute maximum using comparison circuits
- Output Phase: Reveal winner and winning bid only
Comparison Circuit for Two Values:
Function: max(a, b)
1. Compute comparison: c = (a > b)
2. If c = 1, return a; else return b
3. Circuit depth: O(log(bit_length))
Benefits:
- Losing bidders’ privacy protected
- Prevents bid manipulation based on others’ bids
- Ensures fair auction process
Example 3: Private Statistical Analysis
Scenario: Multiple hospitals want to compute average patient age for research without sharing individual patient data.
Data:
- Hospital A: Patients aged [25, 30, 35, 40]
- Hospital B: Patients aged [28, 33, 38]
- Hospital C: Patients aged [22, 27, 32, 37, 42]
SMC Computation:
def secure_average(datasets):
# Step 1: Compute total sum
total_sum = 0
for dataset in datasets:
for age in dataset:
total_sum += secure_add(age) # SMC addition
# Step 2: Compute total count
total_count = 0
for dataset in datasets:
total_count += secure_add(len(dataset))
# Step 3: Secure division
return secure_divide(total_sum, total_count)
Result: Average age ≈ 32.1 years (computed securely)
Privacy Benefits:
- Individual patient ages never revealed
- Hospital-specific statistics remain private
- Only aggregate result is disclosed
❓ Self-Assessment Questions
Question 1: Explain the difference between semi-honest and malicious security models in SMC. When would you use each? (Click to reveal answer)
**Answer:** **Semi-Honest (Honest-but-Curious) Model:** - **Behavior:** Parties follow protocol correctly but try to learn extra information - **Assumptions:** Participants have incentive to cooperate (e.g., mutual benefit) - **Security:** Protects against passive eavesdropping on protocol execution - **Efficiency:** Generally more efficient protocols - **Use Cases:** Research collaborations, industry consortiums with aligned interests **Malicious (Byzantine) Model:** - **Behavior:** Parties may deviate arbitrarily from protocol - **Assumptions:** No trust between participants, potential adversarial behavior - **Security:** Protects against active attacks, input manipulation, protocol corruption - **Efficiency:** More complex and expensive protocols - **Use Cases:** Financial transactions, legal disputes, competitive business scenarios **When to Use Each:** **Semi-Honest Appropriate:** - Medical research collaboration between hospitals - Academic institutions sharing data for research - Companies in non-competitive alliance - Situations with legal/contractual enforcement **Malicious Required:** - Financial auctions and trading - Voting systems - Competitive business intelligence - Legal proceedings requiring high assurance - Cryptocurrency and blockchain applications **Practical Consideration:** Many real-world systems use semi-honest protocols with additional mechanisms (reputation, auditing, legal agreements) to discourage malicious behavior.Question 2: In Yao's Garbled Circuits, why can't the evaluator learn anything about the garbler's input from the garbled circuit? (Click to reveal answer)
**Answer:** **Cryptographic Protection Mechanisms:** **1. Random Wire Labels:** - Each wire has two random labels (for 0 and 1) - Labels are computationally indistinguishable from random - Evaluator cannot determine which label corresponds to which bit value **2. Encrypted Gate Tables:** - Truth table entries are encrypted using wire labels as keys - Evaluator can only decrypt one entry per gate (corresponding to actual computation path) - Cannot decrypt other entries to learn alternative input combinations **3. Point-and-Permute Optimization:** - Wire labels include pointer bits indicating correct ciphertext - Prevents evaluator from trying all decryptions - Maintains security while improving efficiency **4. Oblivious Transfer for Input:** - Evaluator receives only wire labels corresponding to their actual input bits - Cannot obtain labels for other possible inputs - Garbler doesn't learn evaluator's input choice **Example:** ``` Garbler's input: a = 1 Evaluator's input: b = 0 Evaluator receives: - Label for a=1 (but doesn't know it represents 1) - Label for b=0 (corresponding to their input) - Can only decrypt gate table entry for (1,0) combination - Cannot decrypt entries for (0,0), (0,1), or (1,1) ``` **Information-Theoretic Perspective:** From evaluator's view, the garbled circuit is consistent with any possible input from the garbler, providing perfect privacy for the garbler's input.Question 3: Design an SMC protocol for the following scenario: Three friends want to find out who should pay for dinner (person with highest salary) without revealing their actual salaries. (Click to reveal answer)
**Answer:** **Scenario:** Alice, Bob, and Charlie need to determine who has the highest salary for dinner payment without revealing actual amounts. **Protocol Design:** **Step 1: Input Preparation** ```python # Each person has salary: Alice (A), Bob (B), Charlie (C) # Convert to binary representation (assume 32-bit salaries) A_bits = binary_representation(alice_salary) B_bits = binary_representation(bob_salary) C_bits = binary_representation(charlie_salary) ``` **Step 2: Secret Sharing (Using Shamir's Scheme)** ```python # Each person secret-shares their salary alice_shares = secret_share(alice_salary, threshold=2, parties=3) bob_shares = secret_share(bob_salary, threshold=2, parties=3) charlie_shares = secret_share(charlie_salary, threshold=2, parties=3) # Distribute shares # Alice gets: alice_shares[0], bob_shares[0], charlie_shares[0] # Bob gets: alice_shares[1], bob_shares[1], charlie_shares[1] # Charlie gets: alice_shares[2], bob_shares[2], charlie_shares[2] ``` **Step 3: Secure Comparison Circuit** ```python def secure_max_of_three(a_shares, b_shares, c_shares): # Compare A vs B ab_comparison = secure_comparison(a_shares, b_shares) # 1 if A>B, 0 otherwise max_ab = secure_conditional_select(ab_comparison, a_shares, b_shares) winner_ab = secure_conditional_select(ab_comparison, "Alice", "Bob") # Compare max(A,B) vs C max_abc_comparison = secure_comparison(max_ab, c_shares) final_winner = secure_conditional_select(max_abc_comparison, winner_ab, "Charlie") return final_winner ``` **Step 4: Protocol Execution** 1. **Round 1:** Exchange secret shares 2. **Round 2:** Jointly evaluate comparison circuit 3. **Round 3:** Reconstruct only the winner's identity **Alternative Approaches:** **Approach 1: Yao's Garbled Circuits** - Alice creates garbled circuit for 3-way comparison - Bob and Charlie use Oblivious Transfer for inputs - Output: Winner's name only **Approach 2: Additive Secret Sharing** ```python # Each salary split as: salary = share1 + share2 + share3 # Comparison done on shares without reconstruction def compare_shared_values(shares_a, shares_b): # Compute shares_a - shares_b diff_shares = [a - b for a, b in zip(shares_a, shares_b)] # Secure sign test on difference return secure_sign_test(diff_shares) ``` **Privacy Properties:** - No one learns actual salary amounts - Only winner's identity is revealed - Protocol secure against any single corrupt party - Can be extended to handle ties (split payment) **Optimizations:** - Use approximation for efficiency (compare salary ranges) - Batch multiple dinner decisions - Add noise for differential privacyQuestion 4: What are the main performance bottlenecks in SMC protocols, and how do modern systems address them? (Click to reveal answer)
**Answer:** **Main Performance Bottlenecks:** **1. Communication Complexity** - **Problem:** SMC requires extensive message exchange between parties - **Garbled Circuits:** O(|C|) communication where |C| is circuit size - **Secret Sharing:** Multiplication gates require interaction rounds **2. Computational Overhead** - **Problem:** Cryptographic operations are expensive - **Garbling:** Creating encrypted truth tables for each gate - **Oblivious Transfer:** Expensive for input sharing **3. Round Complexity** - **Problem:** Sequential gates create communication rounds - **Circuit Depth:** Deep circuits require many sequential rounds - **Latency:** Network delays amplified by round count **4. Circuit Size** - **Problem:** Boolean circuits can be very large - **Multiplication:** Quadratic blowup for arithmetic operations - **Comparison:** Logarithmic depth but wide circuits **Modern Solutions:** **1. Communication Optimizations** ```python # Preprocessing Phase (offline) def offline_preprocessing(): # Generate random multiplication triples # Create garbled circuits in advance # Prepare OT correlations pass # Online Phase (fast) def online_computation(inputs): # Use preprocessed materials # Minimal communication # Fast execution pass ``` **2. Algorithmic Improvements** - **Free XOR:** XOR gates computed without communication - **Half Gates:** Reduce garbled circuit size by 50% - **Batching:** Amortize costs across multiple computations **3. Hardware Acceleration** ```python # GPU acceleration for parallel operations class SMCAccelerator: def __init__(self): self.gpu_context = initialize_gpu() def parallel_garbling(self, circuit): # Garble multiple gates simultaneously return gpu_parallel_encrypt(circuit.gates) def batch_ot(self, choices): # Batch oblivious transfers return gpu_batch_ot(choices) ``` **4. Protocol-Specific Optimizations** **Garbled Circuits:** - **Pipeline:** Overlap garbling, transfer, and evaluation - **Streaming:** Process circuit in chunks - **Compression:** Reduce garbled table sizes **Secret Sharing:** - **Packed Secret Sharing:** Multiple secrets per sharing - **BGV/CKKS:** Homomorphic encryption for arithmetic - **Replicated Sharing:** Honest majority protocols **5. Hybrid Approaches** ```python def hybrid_computation(function): if is_arithmetic_heavy(function): return secret_sharing_protocol(function) elif is_boolean_heavy(function): return garbled_circuits_protocol(function) else: return mixed_protocol(function) ``` **Performance Metrics:** - **Latency:** Sub-second for simple computations - **Throughput:** Thousands of operations per second - **Scalability:** Hundreds of parties in some protocols **Real-World Performance:** - **Simple comparisons:** ~1ms - **Private set intersection:** ~seconds for thousands of items - **Machine learning inference:** ~minutes for neural networksQuestion 5: Explain the trade-offs between different SMC approaches for implementing private machine learning. (Click to reveal answer)
**Answer:** **SMC Approaches for Private ML:** **1. Garbled Circuits** ```python # Neural network as boolean circuit def nn_garbled_circuit(weights, inputs): # Convert floating point to fixed point # Implement each layer as circuit # ReLU activation as comparison + selection pass ``` **Advantages:** - Constant rounds (low latency) - Good for small, deep networks - Handles non-linear activations well **Disadvantages:** - Expensive for large networks - Fixed-point arithmetic limitations - Circuit size grows with precision **2. Homomorphic Encryption** ```python # Encrypted computation on encrypted data def nn_homomorphic(encrypted_weights, encrypted_inputs): # Linear operations: direct computation # Non-linear: approximation with polynomials return encrypted_prediction ``` **Advantages:** - Minimal communication - Good for linear operations (convolutions) - Supports high precision **Disadvantages:** - Limited operations (mainly addition/multiplication) - Polynomial approximations for activations - High computational overhead **3. Secret Sharing (SPDZ/ABY³)** ```python # Arithmetic sharing for neural networks def nn_secret_sharing(shared_weights, shared_inputs): # Matrix multiplication: efficient # ReLU: requires comparison protocol return shared_prediction ``` **Advantages:** - Efficient arithmetic operations - Scales to large networks - Good throughput **Disadvantages:** - Many communication rounds - Expensive non-linear operations - Requires honest majority (some protocols) **4. Hybrid Approaches** ```python def hybrid_nn_protocol(layers): for layer in layers: if layer.type == "linear": result = secret_sharing_compute(layer) elif layer.type == "activation": result = garbled_circuits_compute(layer) elif layer.type == "pooling": result = homomorphic_compute(layer) return result ``` **Trade-off Analysis:** **Latency vs Throughput:** - **Garbled Circuits:** Low latency, low throughput - **Secret Sharing:** Higher latency, high throughput - **Homomorphic:** Variable latency, medium throughput **Network Size:** - **Small networks (<1000 parameters):** Garbled circuits preferred - **Medium networks (1K-1M parameters):** Hybrid approaches - **Large networks (>1M parameters):** Secret sharing or specialized protocols **Precision Requirements:** - **Low precision (8-16 bits):** All approaches viable - **High precision (32+ bits):** Homomorphic encryption or high-precision secret sharing **Security Models:** - **Semi-honest:** All approaches available - **Malicious:** Limited options, typically secret sharing with verification **Practical Considerations:** **Training vs Inference:** ```python # Training: requires backward pass def private_training(): # Forward pass + backward pass # Gradient computation and updates # More complex, typically secret sharing pass # Inference: forward pass only def private_inference(): # Single forward pass # Can use garbled circuits for low latency pass ``` **Real-World Performance:** - **Image classification (ResNet-18):** ~1-10 seconds per image - **Linear regression:** ~100ms per prediction - **Deep neural networks:** Minutes for training epochs **Emerging Trends:** - **Specialized hardware:** Crypto-processors for SMC - **Approximate computation:** Trade accuracy for speed - **Federated learning integration:** Combine with distributed training📚 Additional Resources
Foundational Papers
- Yao, A. C. (1982). “Protocols for secure computations”
- Ben-Or, M. et al. (1988). “Completeness theorems for non-cryptographic fault-tolerant distributed computation”
- Goldreich, O. et al. (1987). “How to play any mental game”
Modern Protocols
- Damgård, I. et al. (2012). “Multiparty computation from somewhat homomorphic encryption”
- Zahur, S. et al. (2015). “Two halves make a whole: Reducing data transfer in garbled circuits”
Practical Systems
- SCALE-MAMBA: Large-scale SMC platform
- ABY Framework: Mixed protocol SMC
- MP-SPDZ: Versatile SMC compiler
- EMP Toolkit: Efficient secure computation
Applications
- Private machine learning: CrypTFlow, SecureML
- Private database queries: PIR systems
- Secure auctions: Financial trading platforms