TEE
Final Thoughts
This is another work that cites BlindBox. Probably to gain more freedom in inspecting encrypted payloads, the authors created an isolated environment in which DPI can freely check the decrypted traffic without compromising customers' privacy.
No evaluation was included, probably because this is a patent. Judging from the process flow, however, the delay is likely to be large: the encrypted network traffic, whether modified or original, is forwarded to the receiver only after the kernel application has finished inspecting it.
The idea of using non-transitory media in the TEE is appealing, since it gives some assurance that decrypted network traffic is not stored for a long time.
Architecture
Workflow
1) The Trusted Execution Environment (TEE) provides attestation to both sender and receiver that it is authentic.
2) Sender and receiver establish an encrypted network session between them.
3) Sender includes the symmetric session key in the encrypted network session.
4) Kernel application uses the symmetric session key to decrypt the encrypted network traffic.
5) Kernel application inspects the decrypted network traffic using the Sensitivity Policy.
6) If the decrypted network traffic contains sensitive data, the kernel application redacts it and forwards the filtered decrypted data.
7) Logic Application further processes the filtered decrypted data.
8) Kernel application records details about the decrypted network traffic and can forward the logs to the sender or receiver for auditing.
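Steps 5, 6 and 8 of the workflow can be sketched roughly as follows. This is a minimal illustration only, not the patented method: decryption (step 4) is left out, and the class name `KernelApplication`, the rule names and the regular expressions in `SENSITIVITY_POLICY` are all assumptions made up for the example.

```python
import re
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical Sensitivity Policy: rule name -> pattern that marks sensitive data.
SENSITIVITY_POLICY = {
    "card_number": re.compile(rb"\b\d{4}-\d{4}-\d{4}-\d{4}\b"),
    "password_field": re.compile(rb"(?<=password=)[^&\s]+"),
}


@dataclass
class KernelApplication:
    """Toy stand-in for the kernel application running inside the TEE."""
    policy: dict
    log: list = field(default_factory=list)

    def filter(self, plaintext: bytes) -> bytes:
        """Redact sensitive data (steps 5-6) and record what was matched (step 8)."""
        filtered = plaintext
        for rule, pattern in self.policy.items():
            filtered, matches = pattern.subn(b"[REDACTED]", filtered)
            if matches:
                self.log.append({
                    "rule": rule,
                    "matches": matches,
                    "time": datetime.now(timezone.utc).isoformat(),
                })
        return filtered


kernel = KernelApplication(SENSITIVITY_POLICY)
# Only this filtered copy would be handed to the Logic Application (step 7).
filtered = kernel.filter(b"user=bob&password=hunter2&card=1234-5678-9012-3456")
```

In this sketch the Logic Application never sees the unredacted bytes, and the log stores only rule names and match counts rather than the sensitive values themselves.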
Inside the Middlebox
Logic Application
Depending on the configuration, Logic Application can modify the decrypted data or perform other actions regarding the decrypted data
Inside TEE
Kernel Application
Handles decryption, loads and matches the Sensitivity Policy, and filters sensitive data when there is a match
Sensitivity Policy
Set of rules to identify the sensitive data
Log
TEE stores logs of the filtered decrypted data
It also monitors access to the logs (who accessed them and when), so that access details are available if there is ever a malicious insider
PROBLEM
Allowing middleboxes to decrypt network traffic makes sensitive data vulnerable to exposure, not only to attackers but also to malicious insiders such as network administrators.
GOAL
A computer-implemented method in which a middlebox can decrypt encrypted network traffic in a trusted environment while ensuring that no sensitive data leaks outside the trusted environment
DPI Decryption/Encryption Scheme
BlindBox
PROBLEM
Existing middleboxes (MBs) compromise the security of HTTPS because they use man-in-the-middle attacks to decrypt the traffic
This makes customers/users uncomfortable and wary about their privacy
Use of encryption protocols has grown fast in recent years
Inspection of HTTPS payloads has been a challenge for years
GOAL
ULTIMATE GOAL: Allow all MBs to coexist with encryption
Protect users' privacy from MBs while still enabling MBs to inspect traffic
Striking quotes/lines
"One should give up privacy only if there is a reason for suspicion."
"Privacy is only affected with a cause."
Final Thoughts
Same architecture and workflow idea as the intended research topic
The detection module first checks for suspicious keywords; if there is a match, it chooses an action: either drop the packet or dig deeper and decrypt the network traffic
The paper mentions that matching on fewer strings is essential to its performance, so this might be a problem for the current workflow idea.
Advantage
BlindBox really protects the privacy of the users
Disadvantages
BlindBox's client-side performance also depends on the performance of the client's system
Requires users to follow the protocol by tokenizing the traffic and sending the initial salt
Restricts the use of more than one keyword in string matching, for performance reasons
Architecture
Setup
Sender
1) Encrypts traffic with SSL
2) Tokenizes the traffic by splitting it into substrings
3) Encrypts the resulting tokens using the DPIEnc encryption scheme
Middlebox
4) Receives traffic and encrypted tokens
5) Detection module searches for strings to match
6) If there is a string match, decrypts the traffic to match it further
7) If there is a final match, the detection module takes an action
Receiver
8) Receives the SSL traffic, no verification needed
At least one of the endpoints is honest
Privacy Models
Exact Match Privacy
Def: MB can discover only those substrings of the traffic that match known attack keywords
Protocol I
Single match string
Protocol II
Multiple match string
Process
1) Sender attaches to each encrypted token the offset in the stream where the keyword appeared
2) MB checks the rule tree to see whether the encrypted token exists
Probable Cause Privacy
Def: MB can decrypt traffic only if there is a match to a suspicious keyword.
Note: Must be triggered as infrequently as possible, with 1 string per rule, for efficiency and lower overhead
This might be a problem: e.g. for Snort Community, 33% of rules would be used versus 11.3% for Embark, so probable-cause decryption would be used for 58% of the rules versus only 20.2% for Embark.
Protocol III
Uses regex and scripts
If a keyword matches in the traffic, MB can decrypt the traffic
Process
1) Embed kSSL into the encrypted tokens
2) MB checks for keywords in the traffic
3) MB uses kSSL to encrypt its keyword; if the result equals the encrypted token received from the sender, the detection matches
(For the detailed algorithm, see page 9, Section 5)
DPIEnc Encryption Scheme
Process
1) Traffic is tokenized. The paper's implementation uses 8 bytes per token, but the size can be changed depending on the desired implementation
2) Sender generates the tokens that could match keywords starting and ending at delimiter-based offsets
3) Given a salt and an AES key, a random function encrypts the token
4) The ciphertext is reduced to a length of 5 bytes to reduce bandwidth overhead
5) A counter table keeps track of how many tokens have already appeared in the stream; since this is also the basis of the salt, the MB only needs to receive the initial salt to generate its own ciphertexts
6) Once the MB receives the initial salt, it generates the garbled output for its encrypted rules and then uses this output to match each received encrypted token
7) MB performs matching with a logarithmic lookup to improve performance (e.g. for a ruleset with 1000 keywords to match, it is four orders of magnitude faster than a linear scan)
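The process above can be sketched end to end as below. This is a simplified illustration under several assumptions, not BlindBox's implementation: HMAC-SHA256 is swapped in for AES so that the example needs only the standard library, sliding-window tokenization is used instead of delimiter-based offsets, and the middlebox is simply handed the per-rule keys that BlindBox would deliver through the garbled-output exchange (so that the MB never learns the session key). All function and class names are made up for the example, and rule keywords must be exactly `TOKEN_LEN` bytes long.

```python
import hashlib
import hmac
from bisect import bisect_left, insort

TOKEN_LEN = 8   # bytes per token, as in the paper's implementation
CT_LEN = 5      # ciphertexts are truncated to 5 bytes


def prf(key: bytes, msg: bytes) -> bytes:
    """Keyed pseudo-random function (HMAC-SHA256 standing in for AES)."""
    return hmac.new(key, msg, hashlib.sha256).digest()


def tokenize(data: bytes) -> list:
    """Sliding-window tokenization into TOKEN_LEN-byte substrings."""
    return [data[i:i + TOKEN_LEN] for i in range(len(data) - TOKEN_LEN + 1)]


def dpienc(token_key: bytes, salt: int) -> bytes:
    """Encrypt the salt under the per-token key, truncated to CT_LEN bytes."""
    return prf(token_key, salt.to_bytes(8, "big"))[:CT_LEN]


def sender_encrypt(k: bytes, salt0: int, data: bytes) -> list:
    """Sender side: the salt for each token is salt0 plus its occurrence count."""
    counts, out = {}, []
    for token in tokenize(data):
        seen = counts.get(token, 0)
        out.append(dpienc(prf(k, token), salt0 + seen))
        counts[token] = seen + 1
    return out


class Middlebox:
    """Keeps a sorted table of expected ciphertexts, one entry per rule."""

    def __init__(self, rule_keys: dict, salt0: int):
        self.rule_keys = rule_keys            # keyword -> prf(k, keyword)
        self.salt0 = salt0
        self.counts = {rule: 0 for rule in rule_keys}
        self.table = sorted((dpienc(rk, salt0), rule)
                            for rule, rk in rule_keys.items())

    def inspect(self, enc_tokens: list) -> list:
        hits = []
        for ct in enc_tokens:
            i = bisect_left(self.table, (ct, b""))   # logarithmic lookup
            if i < len(self.table) and self.table[i][0] == ct:
                _, rule = self.table.pop(i)
                hits.append(rule)
                # Advance this rule's counter and store its next ciphertext.
                self.counts[rule] += 1
                salt = self.salt0 + self.counts[rule]
                insort(self.table, (dpienc(self.rule_keys[rule], salt), rule))
        return hits


k, salt0 = b"0" * 16, 1000
mb = Middlebox({b"attack!!": prf(k, b"attack!!")}, salt0)
hits = mb.inspect(sender_encrypt(k, salt0, b"GET /attack!! and again attack!!"))
```

Because the salt advances with each occurrence, two occurrences of the same token produce different ciphertexts, which is why the middlebox has to replace a rule's table entry after every match.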
Implemented in C
Evaluation
Goal
Show that BlindBox can support the functionality of MBs
Show the performance overheads of BlindBox in the end-to-end process (endpoints and middlebox)
Functionality Evaluation
Setup and Metrics
Measures the fraction of rules supported by Protocol I, Protocol II, and Protocol III
Uses public datasets for watermarking, parental filtering, IDS rules from Snort, and partial rules from Lastline and McAfee Stonesoft
Token lengths of 2, 4, 8, 16, 32
Uses the ICTF2010 network trace from a capture-the-flag contest
Results
What fraction of rules can be implemented?
Protocol I can support 5% of the IDS policies because IDS rules require more than 1 keyword for detection
Protocol II can support 29-67% of policies for IDS
Protocol III can support all applications by enabling decryption when there is probable cause
What fraction of rules can be used with Protocol II given a minimum token length?
The number of rules decreases as token length increases, but the authors claim that this decreases false positive detections as well
Detected 97.1% of the attack keywords and 99% of the attack rules that would have been detected with Snort in the capture-the-flag trace.
Performance Evaluation
Goal
Evaluate performance overheads at both client and the network
Setup and Metrics
Client
Throughput is 20Mbps
2.60GHz processors connected to 10GbE link
Multicore and use only 1 thread per client
CPU supports AES-NI instructions
Latency is up to 10ms RTT
Middlebox
Four 2.6GHz Xeon E5-2650 cores
128GB RAM
Microbenchmark times are averaged over 10,000 iterations; flow completion times are averaged over 5 runs
Results
On the middlebox side, BlindBox performs faster because it reduces all detection to exact matching
3 to 6 orders of magnitude faster than deployed middleboxes that use decryption
On the client side, the BlindBox overhead comes from producing the garbled output and also depends on the tokenization
Encryption takes 30x longer than standard HTTPS on the client side
Can be mitigated with extra cores
Enhanced version
BlindIDS
Final Thoughts
Disadvantages
Still needs the user's resources and implementation
Advantages
Unlike BlindBox, which uses symmetric encryption (AES), BlindIDS uses asymmetric encryption (a private key and a public key), so it no longer generates garbled outputs to match the encrypted traffic
BlindIDS improves on BlindBox because it sends the encrypted rules for matching only once
Enhanced version of BlindBox, implemented in an Intrusion Detection System (IDS)
Evaluation
Performance Evaluation
Result
Memory space no longer depends on the number of connections and tokens, only on the number of detection rules
Detection is 25% faster than BlindBox
Decryption takes longer than in BlindBox and standard HTTPS because it uses an asymmetric encryption scheme
Decreases the latency of setting up HTTPS connections by 3 orders of magnitude
Decreases the performance overhead of BlindBox by 6 orders of magnitude since less memory space is needed
Goal
Show the client-side overhead and BlindIDS' scalability
Functional Evaluation
Result
To show its detection capability, BlindIDS was also measured on a capture-the-flag trace. It detected 96.5% of the attack keywords and 98.3% of the attacks that Snort should have detected, around 1% less than BlindBox achieved.
In supporting the datasets, BlindIDS achieved 100% for both Malware Blocklists and URL Deny Lists, but 77.3% and 75% for Yara rules and Snort rules respectively. The decrease is because full regex support was omitted, since it is inherently slow, although some regexes were still included.
Goal
Show the ability to detect attacks using standard signature-based rules
Setup
Intel (R) Xeon (R) with E5-1620 CPU with 4 cores running at 3.70GHz under 64-bit Linux OS
Dataset
Snort Rules
Yara rules
Malware Blocklist
URL Deny Lists
PROBLEM
BlindBox is hard to deploy in the real world
BlindBox requires cleartext format of the signature-based patterns
BlindBox requires middleboxes (MBs) to encrypt entire set of malicious patterns to use in detecting network traffic.
Needs large amount of memory space
Which actually affects time for connection setup
In addition to the usual challenges an IDS faces in inspecting HTTPS traffic
Written in Java
Architecture
Decryptable Searchable Encryption
Process Flow
KeyGen
Input: Security Parameter
Output: Public Key, Private Key, Trapdoor Key
Enc
Input: keyword, public key
Output: encrypted keyword, and public key
TrapGen
Input: keyword, trapdoor key
Output: encrypted trapdoor keyword
Test
Input: encrypted keyword, encrypted trapdoor keyword
Output: Match or Not matched
Dec
Input: encrypted keyword, private key
Output: plaintext keyword
Honest-but-Curious Entity
Security Editor - the providers of rules for detecting malicious network traffic
Service Provider - middleboxes that use Security Editor's rules
Overall Process Flow
Setup
1) Sender generates a private key for itself and a public key for the receiver
Send
2) Sender encrypts the token and sends it along with the receiver's public key
RuleGen
3) SE generates a pair consisting of its secret key and SP's public key for SE's encrypted rules
4) SE encrypts its rules and passes the encrypted rules to SP
Detect
5) Given its public key, the encrypted traffic, and the encrypted rules, SP does a lookup to check whether the encrypted rules match the encrypted traffic
Receive
6) Given the sender's public key and the encrypted traffic, the receiver decrypts the encrypted traffic into plain traffic
GOAL
Security-aware: supports DPI inspection of encrypted network traffic while protecting the privacy of the signature patterns
Privacy-friendly, and not only for the client side
Achieve great performance and become deployable in the real world
To achieve the three (3) security model properties
Traffic Indistinguishability Property
Even if some part of the traffic is obtained, an adversary cannot tell whether other parts of the traffic are connected to the obtained network traffic.
SP will not learn any information about the traffic
Rule Indistinguishability Property
SP will never learn any information about the rules
Detection Property
Any malicious traffic must be detected by Service Provider (SP)
Inspired by
LOCKS
Architecture
High-level Workflow
1) Client initiates SSL connection to Server
2) TCP handshake established
3) Generate session key and share it with the LOCKS registrar installed in the web browser
4) Keys will be stored in LOCKS database and will be forwarded to the IDS using asymmetric encryption
5) IDS will use this to decrypt and perform necessary actions
SSL Session Key Sharing
Cooperative Key Sharing - which uses asymmetric encryption
Selective-Providing Keys - a list of sensitive assets makes users aware of the consequences of sending session keys from these assets
SSL Flow Regulation - when decryption should happen
Firewall - drops encrypted network traffic until session keys are shared
Block encrypted traffic only from malicious/suspicious domains. This causes a delay because the network traffic is inspected first, and by then the client has already received the malicious traffic
Inline monitoring also causes performance issues, since it checks and sends the packets at the same time, but its advantage is that it can drop a packet the moment it is verified as malicious
Let all traffic through until the session key has been shared in the LOCKS database. This way the IDS can decrypt the data and stop the malicious traffic once it is verified. There will be a delay in sending, but it guarantees a faster response time if packets are to be dropped.
Preferred by the authors
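The pass-through option can be sketched as below. This is only an illustration of the control flow, assuming a caller-supplied `decrypt` function and `is_malicious` check; the class name `PassThenInspect`, the session identifiers and the single-byte XOR "cipher" used in the usage lines are all hypothetical and stand in for the real SSL decryption and IDS rules.

```python
from collections import defaultdict


class PassThenInspect:
    """Forward traffic until the session key is escrowed, then inspect it."""

    def __init__(self, decrypt, is_malicious):
        self.decrypt = decrypt              # decrypt(key, ciphertext) -> plaintext
        self.is_malicious = is_malicious    # is_malicious(plaintext) -> bool
        self.keys = {}                      # session id -> shared session key
        self.backlog = defaultdict(list)    # ciphertexts seen before the key arrived
        self.blocked = set()                # sessions verified as malicious

    def on_packet(self, session, ciphertext) -> bool:
        """Return True if the packet is forwarded, False if it is dropped."""
        if session in self.blocked:
            return False
        key = self.keys.get(session)
        if key is None:
            # No key yet: let the packet through but keep a copy for later.
            self.backlog[session].append(ciphertext)
            return True
        return self._inspect(session, key, ciphertext)

    def on_key_shared(self, session, key) -> None:
        """Called when the session key shows up in the key database."""
        self.keys[session] = key
        for ciphertext in self.backlog.pop(session, []):
            self._inspect(session, key, ciphertext)

    def _inspect(self, session, key, ciphertext) -> bool:
        if self.is_malicious(self.decrypt(key, ciphertext)):
            self.blocked.add(session)
            return False
        return True


def xor(key, data):
    """Toy placeholder cipher, used only to exercise the control flow."""
    return bytes(b ^ key for b in data)


ids = PassThenInspect(decrypt=xor, is_malicious=lambda p: b"exploit" in p)
early = ids.on_packet("s1", xor(0x2A, b"exploit payload"))  # forwarded, no key yet
ids.on_key_shared("s1", 0x2A)                               # backlog is inspected now
late = ids.on_packet("s1", xor(0x2A, b"more data"))         # dropped, session blocked
```

The first malicious packet still reaches the client in this sketch, which is the trade-off of this option: nothing is held back, but the session is cut off as soon as the key makes inspection possible.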
Evaluation
Goal
Show Communication Latency
Setup and Metrics
Measures the difference between the timestamps of the SYN packet and the ACK packet
Ran 100 tests downloading 100KB~10MB files from a local web server and from a hosted web server, and removed the five (5) worst values as outliers
Blue Coat IDS with MITM vs Blue Coat IDS with LOCKS
Result
LOCKS performs slightly better than MITM, but the difference is small
Usability Testing
Setup and Metrics
Provided a questionnaire about how easy it is to set up LOCKS
A score above 68 is above average and a score below 68 is below average
Result
LOCKS achieved 85.6%
Performance
Setup and Metrics
Bro IDS with SSL decryption enabled vs Bro IDS without SSL decryption enabled
Machine running Squid Proxy and Bro, and a web server
Uses traffic shaping tools in Linux for various bandwidth and ran multiple browser instances simultaneously
Use capture loss facility to measure the effect of decryption
Result
LOCKS increased the Bro IDS packet loss rate above its usual level
Striking Lines
"These protocols should be thoroughly analyzed by the security community before they are safely deployed in an operation setting."
GOAL
The SSL connection must remain completely intact while the payload of the network traffic can still be checked
Final Thoughts
The paper does not say which asymmetric encryption scheme LOCKS uses to send the session key to the IDS
Requires a custom version of the SSL library to support LOCKS, which needs to be updated for every browser release
The LOCKS method of sharing the session key from the browser to an escrow server seems to be used in threat analysis
Decrypting Network Traffic in IDS
Optimization
Optimizing DPI for High-Speed Traffic Analysis
PROBLEM
Deep Packet Inspection (DPI) is used extensively and relied on heavily by security applications, so its performance cost becomes expensive.
GOAL
Find a set of optimizations that lowers the performance cost of DPI by finding the fastest algorithm and the best operating conditions without compromising accuracy.
Overview of Suggested Improvements
Since regular expressions are expensive, implement them as Deterministic Finite Automata (DFA), so that the computational cost depends only on the length of the input sequence. The number of states produced is bounded: if it exceeds 100K, the DPI stops and does not search further.
Disadvantage
Can cause memory explosion because of the number of states
Limits support for context-sensitive regular expressions
Advantage
Because a DFA divides the signature into sequential states, its processing performance is four orders of magnitude better than that of a Non-deterministic Finite Automaton (NFA)
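To illustrate why DFA matching cost depends only on the input length, here is a minimal sketch that builds a DFA for a single literal signature and scans a payload with one table lookup per byte. It is a textbook substring automaton, not the paper's regex-to-DFA compiler, and the names `build_dfa` and `dfa_search` are made up for the example.

```python
def build_dfa(pattern: bytes) -> list:
    """Build a transition table: table[state][byte] -> next state.

    State s means "the last s bytes read equal pattern[:s]".
    Bytes that do not occur in the pattern fall back to state 0.
    """
    alphabet = set(pattern)
    size = len(pattern)
    table = []
    for state in range(size + 1):
        row = {}
        for byte in alphabet:
            seen = pattern[:state] + bytes([byte])
            # Longest prefix of the pattern that is a suffix of what was read.
            nxt = min(size, len(seen))
            while nxt > 0 and not seen.endswith(pattern[:nxt]):
                nxt -= 1
            row[byte] = nxt
        table.append(row)
    return table


def dfa_search(table: list, data: bytes) -> bool:
    """Scan the payload once; each byte costs a single table lookup."""
    accept = len(table) - 1
    state = 0
    for byte in data:
        state = table[state].get(byte, 0)
        if state == accept:
            return True
    return False


table = build_dfa(b"abab")
found = dfa_search(table, b"xxabaababxx")
missing = dfa_search(table, b"abaabaaba")
```

The table here has only `len(pattern) + 1` states; the state explosion noted above comes from compiling full regular expressions, where the number of DFA states can grow exponentially.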
Use Packet-Based, Per-Flow State (PBFS) to analyze data packet by packet instead of analyzing the payload at the application layer.
First check whether the packet size exceeds the content that the signature will match. There is no need to check the application-level protocol, since the first few bytes of a session's packets are already enough to classify the session.
Stop inspecting after multiple classification attempts
Disadvantage
Decreases the possibility of random payload matches
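A rough sketch of the PBFS idea follows: each flow keeps a small state record, each packet is checked on its own against the signatures, signatures longer than the payload are skipped, and inspection stops after a fixed number of attempts. The signature list, the minimum lengths and the `MAX_ATTEMPTS` value are assumptions chosen for the example, not values from the paper.

```python
import re
from dataclasses import dataclass
from typing import Optional

MAX_ATTEMPTS = 3  # hypothetical limit on classification attempts per flow

# Hypothetical signatures: (protocol, minimum payload length, pattern).
SIGNATURES = [
    ("http", 4, re.compile(rb"^(GET|POST|HEAD) ")),
    ("ssh", 4, re.compile(rb"^SSH-")),
]


@dataclass
class FlowState:
    label: Optional[str] = None
    attempts: int = 0


def classify(flows: dict, flow_id, payload: bytes) -> Optional[str]:
    """Return the flow's label, "unknown" once attempts run out, else None."""
    state = flows.setdefault(flow_id, FlowState())
    if state.label is not None:
        return state.label              # already classified: no more matching
    if state.attempts >= MAX_ATTEMPTS:
        return "unknown"                # stop inspecting this flow
    if not payload:
        return None                     # nothing to match, not counted as an attempt
    state.attempts += 1
    for name, min_len, pattern in SIGNATURES:
        if len(payload) < min_len:
            continue                    # packet too short for this signature
        if pattern.match(payload):
            state.label = name
            return name
    return "unknown" if state.attempts >= MAX_ATTEMPTS else None


flows = {}
web = ("10.0.0.1", 40000, "10.0.0.2", 80, "tcp")
first = classify(flows, web, b"GET / HTTP/1.1\r\n")
cached = classify(flows, web, b"\x00\x01\x02")

p2p = ("10.0.0.3", 50000, "10.0.0.4", 6881, "udp")
attempts = [classify(flows, p2p, b"\x13\x37\x00\x01") for _ in range(4)]
```

Once a flow has a label, or has used up its attempts, later packets cost only a dictionary lookup.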
Final Thoughts
Finding a way to eliminate many packets in the initial filters will really help lower the performance cost
Understanding the signature rules, and the traffic those rules are meant to match, is also essential. A low performance cost when inspecting a given set of traffic does not necessarily reflect the DPI's capability; it can also depend on the traffic the environment produces. For example, if the traffic passing through is mostly P2P but the signature rules used are mostly for HTTP, this kind of metric is not strong evidence, and more indicators are needed to support the performance evaluation.
Evaluation
Setup
Signature database of 17 filters
Traffic was collected at two (2) universities. POLITO-GT has more P2P and WebTV traffic. POLITO has more normal traffic, generated by 6000 hosts. UNIBS-GT also has P2P traffic but has more normal traffic, generated by 20 hosts.
Goal
Show the performance improvement
Show the accuracy in terms of percentage of unknown traffic
DFA as Pattern Matching Algorithm
DFA memory occupancy grows exponentially when the regular expression used is complex (without wildcard * or anchor ^), but the execution cost is lower
Anchored (^ or *) expressions are more prone to false positives, but Kleene (complex regex) expressions are costly in non-matching cases
Performance of DPI PBFS
The cost of pattern matching depends on the packet size: a 1-byte payload has the lowest cost and a 1460-byte payload the highest
Classification of POLITO-GT, POLITO, and UNIBS-GT differs because of the sources. POLITO-GT has the highest percentage of unknown traffic because a large amount of its traffic came from P2P, while POLITO has less unknown traffic since the collected traffic came from known applications.
DPI Optimizations
UDP has a lower optimization cost because its packets are smaller than TCP's
Some protocols (e.g. telnet, direct connect++) were checked in a high number of packets because of poor signature classification, so limiting the classification attempts helps improve DPI performance
Reference:
Y. Sun, D. Marino, S.K. Nanda, S. Shintre, B.T. Witten, R.A. Frederick, Q. Li, "Decrypting Network Traffic on a Middlebox Device Using a Trusted Execution Environment", Symantec Corp, US10044681B1, February 2018.
References:
J. Sherry, C. Lan, R.A. Popa, S. Ratnasamy, "BlindBox: Deep Packet Inspection over Encrypted Traffic", In Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication, 2015.
S. Canard, A. Diop, N. Kheir, M. Paindavoine, M. Sabt, "BlindIDS: Market-Compliant and Privacy-Friendly Intrusion Detection System Over Encrypted Traffic", In Proceedings of the 2017 ACM Conference Computer and Communications Security, 2017.
N. Cascarano, L. Ciminiera, F. Risso, "Optimizing Deep Packet Inspection for High-Speed Traffic Analysis", Journal of Network and Systems Management, 2011.
M. Bierma, A. Brown, T.M. Kroeger, H. Poston, T. Delano, "Locally Operated Cooperative Key Sharing (LOCKS)", In International Conference on Computing Network and Communications, 2017.
Y. Sun, D. Marino, S.K. Nanda, S. Shintre, B.T. Witten, R.A. Frederick, Q. Li, "Decrypting Network Traffic on a Middlebox Device Using a Trusted Execution Environment", Symantec Corp, US10044691B1, February 2018.