TEE, DPI Decryption/Encryption Scheme, Reference:, Y. Sun, D. Marino, S.K.…

- - - - Depending on the configuration, the Logic Application can modify the decrypted data or perform other actions on it
    - - Kernel Application
        
        Handles decryption, loads and matches the Sensitivity Policy, and acts to filter sensitive data once there is a match
      - Sensitivity Policy
        
        Set of rules to identify the sensitive data
      - Log
        
        TEE stores logs of the filtered decrypted data
        
        It also monitors access to the logs (who accessed them and when) to provide logon details in case of a malicious insider
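
The Kernel Application, Sensitivity Policy and Log described above could be sketched roughly as follows. This is only an illustration: the rule format (a name mapped to a regular expression), the `[REDACTED]` marker and the class names `SensitivityPolicy` and `AuditedLog` are assumptions, not part of the referenced scheme, and no real TEE is involved.

```python
import re
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class SensitivityPolicy:
    # Hypothetical rule format: rule name -> regular expression.
    rules: dict[str, str]

    def redact(self, plaintext: str) -> tuple[str, list[str]]:
        """Return the filtered text and the names of the rules that matched."""
        matched = []
        for name, pattern in self.rules.items():
            plaintext, count = re.subn(pattern, "[REDACTED]", plaintext)
            if count:
                matched.append(name)
        return plaintext, matched


@dataclass
class AuditedLog:
    entries: list[str] = field(default_factory=list)
    accesses: list[tuple[str, datetime]] = field(default_factory=list)

    def append(self, entry: str) -> None:
        self.entries.append(entry)

    def read(self, reader: str) -> list[str]:
        # Every read is recorded with who accessed the log and when.
        self.accesses.append((reader, datetime.now(timezone.utc)))
        return list(self.entries)


policy = SensitivityPolicy({"card_number": r"\b\d{4}-\d{4}-\d{4}-\d{4}\b"})
log = AuditedLog()

# Decrypted data is filtered before it is logged.
filtered, hits = policy.redact("pay with 1234-5678-9012-3456 today")
log.append(filtered)
log.read(reader="analyst-1")
```

Keeping the access record inside the same object as the entries means a reader cannot fetch the log without leaving a trace, which is the property the notes ask for.
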
- - - - Makes the customers/users uncomfortable and wary of their privacy
  - - - The detection module first checks for suspicious keywords; if there is a match, it chooses an action: either drop the packet or dig deeper and decrypt the network traffic
      - The paper mentions that using fewer strings in matching is essential to its performance, so this might be a problem with the current workflow idea.
    - - BlindBox really protects the privacy of the users
    - - BlindBox performance on the client side also depends on the client's system performance
      - Requires users to follow the protocol by tokenizing the traffic and sending the initial salt
      - Limits the use of more than one keyword in string matching for performance reasons
  - - - Sender
        
        1) Encrypts traffic with SSL
        
        2) Tokenizes the traffic by splitting it into substrings
        
        3) Encrypts the resulting tokens using the DPIEnc encryption scheme
      - Middlebox
        
        4) Receives traffic and encrypted tokens
        
        5) Detection module searches for strings to match
        
        6) If there is a string match, decrypts the traffic to further match the traffic
        
        7) If there is a final match, the detection module takes an action
      - Receiver
        
        8) Receives the SSL traffic, no verification needed
      - At least one of the endpoints is honest
    - - Exact Match Privacy
        
        Def: MB can discover only the substrings of the traffic that match known attack keywords
        
        Protocol I
        
        Single match string
        
        Protocol II
        
        Multiple match string
        
        Process
        
        1) Sender attaches to each encrypted token the offset in the stream where the keyword appeared
        
        2) MB checks whether the encrypted token exists in the rule tree
      - Probable Cause Privacy
        
        Def: MB can decrypt traffic if there is a match to the suspicious keyword.
        
        Note: Must be triggered as infrequently as possible, 1 string per rule, for efficiency and less overhead
        
        This might be a problem: for Snort Community, for example, 33% of rules would be used versus 11.3% for Embark, so probable-cause decryption would be used for 58% of rules versus only 20.2% of rules for Embark.
        
        Protocol III
        
        Uses regex and scripts
        
        If a keyword matches the traffic, MB can decrypt the traffic
        
        Process
        
        1) Embed kSSL into the encrypted tokens
        
        2) MB checks for keywords in the traffic
        
        3) MB uses kSSL to encrypt its keyword; if the encrypted token received from the sender is the same, the detection matches
        
        (For the detailed algorithm, see page 9, section 5)
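
The idea behind Protocol III, that a keyword match is what unlocks kSSL, can be illustrated with a simplified sketch. It is not the paper's algorithm: HMAC-SHA256 stands in for the AES-based functions, the function names are hypothetical, and the sketch simply hands the middlebox a per-keyword key `rule_key` without showing how it would obtain that key without learning `k`.

```python
import hashlib
import hmac
import os


def prf(key: bytes, msg: bytes) -> bytes:
    """HMAC-SHA256 used as a stand-in pseudorandom function (assumption)."""
    return hmac.new(key, msg, hashlib.sha256).digest()


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def sender_encrypt(k: bytes, k_ssl: bytes, token: bytes, salt: bytes):
    """Return a matching tag plus k_ssl masked under a token-derived key."""
    token_key = prf(k, token)
    match_tag = prf(token_key, b"tag" + salt)
    wrapped = xor(prf(token_key, b"key" + salt), k_ssl)
    return match_tag, wrapped


def middlebox_try_recover(rule_key: bytes, salt: bytes,
                          match_tag: bytes, wrapped: bytes):
    """Recover k_ssl only if the token equals the rule's keyword."""
    if not hmac.compare_digest(prf(rule_key, b"tag" + salt), match_tag):
        return None  # no match, so no probable cause and no key
    return xor(prf(rule_key, b"key" + salt), wrapped)


k = os.urandom(16)       # key shared by the endpoints
k_ssl = os.urandom(32)   # session key protecting the SSL traffic
salt = os.urandom(8)

tag, wrapped = sender_encrypt(k, k_ssl, b"/etc/pas", salt)

recovered = middlebox_try_recover(prf(k, b"/etc/pas"), salt, tag, wrapped)
missed = middlebox_try_recover(prf(k, b"harmless"), salt, tag, wrapped)
```

Here `recovered` equals `k_ssl` because the rule keyword matches the token, while `missed` is `None`: without a match the middlebox only sees a value masked by a key it cannot derive.
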
    - - Process
        
        1) Traffic should be tokenized. 8 bytes per token was implemented in the paper, but this can be resized depending on the desired implementation
        
        2) Sender needs to generate tokens that could match keywords that start and end on delimiter-based offsets
        
        3) Given a salt and an AES key, a random function encrypts the token
        
        4) Reduce the size of the ciphertext to a length of 5 bytes to reduce bandwidth overhead.
        
        5) Maintain the counter table to keep track of how many tokens have already appeared in the stream, since this is also the basis of the salt. The MB only receives the initial salt and uses it to generate its own ciphertexts.
        
        6) Once the MB receives the initial salt, it generates the garbled output for its encrypted rules. It then uses this garbled output to match against each received encrypted token.
        
        7) MB performs matching with a logarithmic-time lookup to enhance performance (e.g. for a ruleset with 1000 keywords to match, it is four orders of magnitude faster than a linear scan)
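
The process above can be sketched end to end under some loud simplifications: HMAC-SHA256 stands in for the AES-based random function, tokens come from a plain sliding window rather than delimiter-based offsets, keywords are exactly one token long, a Python dict replaces the logarithmic-time lookup structure, and the middlebox is simply handed the per-keyword keys instead of obtaining them through garbled outputs. All function and class names are hypothetical.

```python
import hashlib
import hmac

TOKEN_LEN = 8    # bytes per token, as implemented in the paper
CIPHER_LEN = 5   # ciphertexts are truncated to 5 bytes


def prf(key: bytes, msg: bytes) -> bytes:
    """HMAC-SHA256 as a stand-in for the AES-based function (assumption)."""
    return hmac.new(key, msg, hashlib.sha256).digest()


def tokenize(payload: bytes) -> list[bytes]:
    """Sliding-window tokenization: one token per byte offset."""
    return [payload[i:i + TOKEN_LEN]
            for i in range(len(payload) - TOKEN_LEN + 1)]


def enc(token_key: bytes, salt: int) -> bytes:
    return prf(token_key, salt.to_bytes(8, "big"))[:CIPHER_LEN]


def sender_encrypt(k: bytes, payload: bytes, salt0: int) -> list[bytes]:
    counters: dict[bytes, int] = {}   # counter table: repeats per token
    out = []
    for token in tokenize(payload):
        n = counters.get(token, 0)
        out.append(enc(prf(k, token), salt0 + n))
        counters[token] = n + 1
    return out


class Middlebox:
    def __init__(self, rule_keys: dict[bytes, str], salt0: int):
        # rule_keys maps prf(k, keyword) -> rule name.
        self.salt0 = salt0
        self.names = rule_keys
        self.counts = {rk: 0 for rk in rule_keys}
        self.table = {enc(rk, salt0): rk for rk in rule_keys}

    def inspect(self, ciphertexts: list[bytes]) -> list[str]:
        hits = []
        for c in ciphertexts:
            rk = self.table.pop(c, None)
            if rk is None:
                continue
            hits.append(self.names[rk])
            # Advance the salt for this rule, mirroring the sender's counter.
            self.counts[rk] += 1
            self.table[enc(rk, self.salt0 + self.counts[rk])] = rk
        return hits


k = b"k" * 16
stream = sender_encrypt(k, b"GET /etc/passwd and /etc/passwd again", salt0=7)
mb = Middlebox({prf(k, b"/etc/pas"): "path-probe"}, salt0=7)
hits = mb.inspect(stream)
```

Because the salt advances with each repeat, the two occurrences of the same token produce different ciphertexts, yet the middlebox still reports `path-probe` twice by re-encrypting its rule after every match.
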
  - - - Show that BlindBox can support the functionality of MBs
      - Shows the performance overheads of BlindBox in the end-to-end process (endpoints and middlebox)
    - - Setup and Metrics
        
        Uses fraction of rules for Protocol I, Protocol II, Protocol III
        
        Uses public datasets for watermarking, parental filtering, IDS rules from Snort, and partial rules from Lastline and McAfee Stonesoft
        
        Token length of 2, 4, 8, 16, 32
        
        Uses the ICTF2010 network trace, from a capture-the-flag contest
      - Results
        
        What fraction of rules can be implemented?
        
        Protocol I can support 5% of the IDS policies because IDS rules require more than 1 keyword for detection
        
        Protocol II can support 29-67% of policies for IDS
        
        Protocol III can support all applications by enabling decryption when there is probable cause
        
        What fraction of rules can be used with Protocol II given a minimum token length?
        
        The number of rules decreases as token length increases, but the authors claim that this decreases false positive detections as well
        
        Detected 97.1% of the attack keywords and 99% of the attack rules that would have been detected with Snort in the capture-the-flag trace.
    - - Goal
        
        Evaluate performance overheads at both client and the network
      - Setup and Metrics
        
        Client
        
        Throughput is 20Mbps
        
        2.60GHz processors connected to 10GbE link
        
        Multicore and use only 1 thread per client
        
        CPU supports AES-NI instructions
        
        Latency is up to 10ms RTT
        
        Middlebox
        
        Four 2.6GHz Xeon E5-2650 cores
        
        128GB RAM
        
        Microbenchmark timings use the average of 10,000 iterations; flow completion times use the average of 5 runs
      - Results
        
        On the middlebox side, BlindBox performs faster because it reduces all detection to exact matching
        
        3 to 6 orders of magnitude faster than deployed middleboxes that use decryption
        
        On the client side, BlindBox overhead comes from producing the garbled output and also depends on the tokenization
        
        Takes 30x longer to encrypt than standard HTTPS on the client side
        
        Can be mitigated with extra cores
  - - - Final Thoughts
        
        Disadvantages
        
        Still needs user's resources and implementation
        
        Advantages
        
        Unlike BlindBox, which uses symmetric encryption (AES), BlindIDS uses asymmetric encryption (a private key and a public key), so it no longer generates garbled outputs to match the encrypted traffic
        
        BlindIDS is better in that it sends the encrypted rules for matching only once
        
        Enhanced version of BlindBox that is implemented in Intrusion Detection System (IDS)
      - Evaluation
        
        Performance Evaluation
        
        Result
        
        Memory space no longer depends on number of connections and number of tokens but only depends on the number of detection rules
        
        Detection is 25% faster than BlindBox
        
        Decryption takes longer than BlindBox and standard HTTPS since it uses an asymmetric encryption scheme
        
        Decreases the latency of setting up HTTPS connections by 3 orders of magnitude
        
        Decreases the performance overhead of BlindBox by 6 orders of magnitude since less memory space is needed
        
        Goal
        
        Shows the overhead on the client side and BlindIDS' scalability
        
        Functional Evaluation
        
        Result
        
        To show detection capability, BlindIDS also uses the capture-the-flag trace. It detected 96.5% of the attack keywords and 98.3% of the attacks that Snort would have detected, around 1% less than BlindBox achieved.
        
        For dataset support, BlindIDS achieved 100% for both Malware Blocklists and URL Deny Lists, but 77.3% and 75% for Yara rules and Snort rules respectively. The decrease is because full regex support was omitted as inherently slow, although some regexes were still included.
        
        Goal
        
        Show the ability to detect attacks using standard signature-based rules
        
        Setup
        
        Intel(R) Xeon(R) E5-1620 CPU with 4 cores running at 3.70GHz under a 64-bit Linux OS
        
        Dataset
        
        Snort Rules
        
        Yara rules
        
        Malware Blocklist
        
        URL Deny Lists
      - PROBLEM
        
        BlindBox is hard to deploy in the real world
        
        BlindBox requires cleartext format of the signature-based patterns
        
        BlindBox requires middleboxes (MBs) to encrypt the entire set of malicious patterns used in inspecting network traffic.
        
        Needs large amount of memory space
        
        This in turn affects connection setup time
        
        These come in addition to the usual challenges an IDS faces in inspecting HTTPS traffic
      - Written in Java
      - Architecture
        
        Decryptable Searchable Encryption
        
        Process Flow
        
        KeyGen
        
        Input: Security Parameter
        
        Output: Public Key, Private Key, Trapdoor Key
        
        Enc
        
        Input: keyword, public key
        
        Output: encrypted keyword, and public key
        
        TrapGen
        
        Input: keyword, trapdoor key
        
        Output: encrypted trapdoor keyword
        
        Test
        
        Input: encrypted keyword, encrypted trapdoor keyword
        
        Output: Match or Not matched
        
        Dec
        
        Input: encrypted keyword, private key
        
        Output: plaintext keyword
        
        Honest-but-Curious Entity
        
        Security Editor - the providers of rules for detecting malicious network traffic
        
        Service Provider - middleboxes that use the Security Editor's rules
        
        Overall Process Flow
        
        Setup
        
        1) Sender generates a private key for itself and a public key for the receiver
        
        Send
        
        2) Sender encrypts the token and sends it along with the receiver's public key
        
        RuleGen
        
        3) Security Editor (SE) generates a pair consisting of its secret key and SP's public key for SE's encrypted rules
        
        4) SE encrypts its rules and passes the encrypted rules to SP
        
        Detect
        
        5) Given its public key, the encrypted traffic, and the encrypted rules, SP does a lookup to check whether the encrypted rules match the encrypted traffic
        
        Receive
        
        6) Given the sender's public key and the encrypted traffic, the receiver decrypts the encrypted traffic into plain traffic
      - GOAL
        
        Security-aware: supports DPI on encrypted network traffic while protecting the privacy of the signature patterns
        
        Privacy-friendly: not only for the client side
        
        Achieve great performance and be deployable in the real world
        
        To achieve the three (3) security model properties
        
        Traffic Indistinguishability Property
        
        Even if some part of the traffic is obtained, an adversary cannot tell whether other parts of the traffic are connected to it.
        
        SP will not know any information about the traffic
        
        Rule Indistinguishability Property
        
        SP will never learn any information about the rules
        
        Detection Property
        
        Any malicious traffic must be detected by Service Provider (SP)
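
The five Decryptable Searchable Encryption algorithms listed in the Architecture notes (KeyGen, Enc, TrapGen, Test, Dec) can be written down as an interface. This is only a shape sketch: the byte-string types, the class names and the `detect` helper are assumptions, and no cryptography is implemented here.

```python
from abc import ABC, abstractmethod
from typing import NamedTuple


class KeySet(NamedTuple):
    public_key: bytes
    private_key: bytes
    trapdoor_key: bytes


class DecryptableSearchableEncryption(ABC):
    """Interface only; a real scheme must supply the cryptography."""

    @abstractmethod
    def keygen(self, security_parameter: int) -> KeySet:
        """Security parameter -> public, private and trapdoor keys."""

    @abstractmethod
    def enc(self, keyword: bytes, public_key: bytes) -> bytes:
        """Keyword and public key -> encrypted keyword."""

    @abstractmethod
    def trapgen(self, keyword: bytes, trapdoor_key: bytes) -> bytes:
        """Keyword and trapdoor key -> encrypted trapdoor keyword."""

    @abstractmethod
    def test(self, encrypted_keyword: bytes, trapdoor: bytes) -> bool:
        """Encrypted keyword and trapdoor -> match or not matched."""

    @abstractmethod
    def dec(self, encrypted_keyword: bytes, private_key: bytes) -> bytes:
        """Encrypted keyword and private key -> plaintext keyword."""


def detect(scheme: DecryptableSearchableEncryption,
           encrypted_tokens: list[bytes],
           encrypted_rules: list[bytes]) -> list[int]:
    """Service Provider side: indices of tokens matching any rule trapdoor."""
    return [i for i, token in enumerate(encrypted_tokens)
            if any(scheme.test(token, rule) for rule in encrypted_rules)]
```

`detect` only ever calls `test`, which reflects the notes: the Service Provider works on encrypted tokens and encrypted rules and never needs `dec`.
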
- - - - 1) Client initiates SSL connection to Server
      - 2) TCP handshake established
      - 3) Generate session key and share it with the LOCKS registrar installed in the web browser
      - 4) Keys will be stored in LOCKS database and will be forwarded to the IDS using asymmetric encryption
      - 5) IDS will use this to decrypt and perform necessary actions
    - - Cooperative Key Sharing - which uses asymmetric encryption
      - Selective-Providing Keys - a list of sensitive assets makes users aware of the consequences of sending session keys from these assets
      - SSL Flow Regulation - when decryption should happen
        
        Firewall - drops encrypted network traffic until session keys are shared
        
        Only block encrypted traffic from malicious/suspicious domains, but this causes delay since the network traffic is inspected first, and by then the client has already received the malicious traffic
        
        Inline monitoring also causes performance issues since it checks and sends the packets at the same time, but its advantage is that it can drop a packet the moment it is verified as malicious
        
        Let all traffic through until the session key has been shared in the LOCKS database. This way the IDS can decrypt the data and stop the malicious traffic once verified. There will be a delay in sending, but it guarantees a faster response time if packets are to be dropped.
        
        Preferred by the authors
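
The flow-regulation option preferred by the authors (let traffic through until the session key is shared, then decrypt and stop malicious flows) might look like the following sketch. The class name `KeyAwareInspector`, the flow identifiers and the XOR routine are all assumptions; in particular `xor_cipher` is a toy placeholder for real SSL record decryption.

```python
from collections import defaultdict
from typing import Callable


class KeyAwareInspector:
    """Forwards records until a session key arrives, then inspects them."""

    def __init__(self, decrypt: Callable[[bytes, bytes], bytes],
                 is_malicious: Callable[[bytes], bool]):
        self.decrypt = decrypt
        self.is_malicious = is_malicious
        self.keys: dict[str, bytes] = {}
        self.pending: dict[str, list[bytes]] = defaultdict(list)
        self.blocked: set[str] = set()

    def on_record(self, flow_id: str, record: bytes) -> bool:
        """Return True if the record is forwarded, False if it is dropped."""
        if flow_id in self.blocked:
            return False
        key = self.keys.get(flow_id)
        if key is None:
            # No key yet: keep a copy for later and let the record through.
            self.pending[flow_id].append(record)
            return True
        return self._inspect(flow_id, key, record)

    def on_key_shared(self, flow_id: str, key: bytes) -> None:
        """Called when the session key for this flow becomes available."""
        self.keys[flow_id] = key
        for record in self.pending.pop(flow_id, []):
            self._inspect(flow_id, key, record)

    def _inspect(self, flow_id: str, key: bytes, record: bytes) -> bool:
        if self.is_malicious(self.decrypt(key, record)):
            self.blocked.add(flow_id)
            return False
        return True


def xor_cipher(key: bytes, data: bytes) -> bytes:
    # Toy placeholder, NOT real SSL decryption.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


key = b"\x42\x17"
ids = KeyAwareInspector(decrypt=xor_cipher,
                        is_malicious=lambda plain: b"malware" in plain)

first = ids.on_record("flow-1", xor_cipher(key, b"GET /malware.exe"))
ids.on_key_shared("flow-1", key)
second = ids.on_record("flow-1", xor_cipher(key, b"GET /index.html"))
```

The first record is forwarded because no key is available yet, which is the delay the notes mention; once the key is shared, the backlog is inspected, the flow is blocked, and the second record is dropped.
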
  - - - Show Communication Latency
        
        Setup and Metrics
        
        Measures the difference between the timestamps of the SYN packet and the ACK packet
        
        Ran 100 tests downloading 100KB~10MB files via a local web server and from a hosted web server, and discarded the five (5) worst values as outliers
        
        Blue Coat IDS with MITM vs Blue Coat IDS with LOCKS
        
        Result
        
        LOCKS performs slightly better than MITM, but the difference is small
      - Usability Testing
        
        Setup and Metrics
        
        Provided a questionnaire about how easy it is to set up LOCKS
        
        A score above 68 is above average and a score below 68 is below average
        
        Result
        
        LOCKS achieved 85.6%
      - Performance
        
        Setup and Metrics
        
        Bro IDS with SSL decryption enabled vs Bro IDS without SSL decryption enabled
        
        Machine running Squid Proxy and Bro, and a web server
        
        Uses traffic shaping tools in Linux for various bandwidth and ran multiple browser instances simultaneously
        
        Uses the capture loss facility to measure the effect of decryption
        
        Result
        
        LOCKS increased the Bro IDS packet loss rate above its usual level
- - - - PROBLEM
        
        Deep Packet Inspection (DPI) is extensively used and heavily relied on by security applications, so its performance cost becomes expensive.
      - GOAL
        
        Find a set of optimizations that lowers the performance cost of DPI by identifying the fastest algorithm and the best operating conditions without compromising accuracy.
      - Overview of Suggested Improvements
        
        Since regular expressions are expensive, implement them as Deterministic Finite Automata (DFA), so that the computational cost depends only on the length of the input sequence. The number of states produced is also bounded: if it exceeds 100K, the search stops.
        
        Disadvantage
        
        Can cause memory explosions because of the number of states
        
        Limits the context-sensitive regular expressions
        
        Advantage
        
        Because a DFA divides the signature into sequential states, its processing performance is better by four orders of magnitude than that of a Non-deterministic Finite Automaton (NFA)
        
        Use of Packet-based, Per-Flow State (PBFS) to analyze data on packet by packet basis instead of analyzing the payload in application layer.
        
        First check whether the packet size exceeds the content that the signature will match; there is no need to check the application-level protocol, since the first few bytes of a session are already enough to classify it.
        
        Stops inspecting after multiple classification attempts
        
        Disadvantage
        
        Decreases the possibility of random payload matches
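
The DFA trade-off noted above (one table lookup per input symbol, at the risk of state explosion) can be illustrated with a textbook subset construction that aborts past a state limit. The NFA encoding, the function names and the example language are assumptions chosen for illustration; only the 100K default mirrors the limit mentioned in the notes.

```python
from collections import deque


class StateLimitExceeded(Exception):
    pass


def determinize(nfa, start, accepting, alphabet, max_states=100_000):
    """Subset construction; nfa maps state -> symbol -> set of next states."""
    start_set = frozenset([start])
    ids = {start_set: 0}
    table = []          # table[state_id][symbol] -> next state_id
    accept = set()
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        row = {}
        for sym in alphabet:
            nxt = frozenset(t for s in current
                            for t in nfa.get(s, {}).get(sym, ()))
            if nxt not in ids:
                if len(ids) >= max_states:
                    raise StateLimitExceeded(f"more than {max_states} states")
                ids[nxt] = len(ids)
                queue.append(nxt)
            row[sym] = ids[nxt]
        table.append(row)   # rows are appended in the same order ids are assigned
        if current & accepting:
            accept.add(ids[current])
    return table, accept


def matches(table, accept, data) -> bool:
    state = 0
    for sym in data:        # exactly one lookup per input symbol
        state = table[state][sym]
    return state in accept


def nth_from_end_nfa(n):
    """NFA over 'a'/'b' accepting strings whose n-th symbol from the end is 'a'."""
    nfa = {0: {"a": {0, 1}, "b": {0}}}
    for i in range(1, n):
        nfa[i] = {"a": {i + 1}, "b": {i + 1}}
    return nfa, 0, {n}


table, accept = determinize(*nth_from_end_nfa(3), "ab")
small_dfa_states = len(table)

try:
    determinize(*nth_from_end_nfa(12), "ab", max_states=1000)
    exploded = False
except StateLimitExceeded:
    exploded = True
```

This example language needs 2^n DFA states, so a 4-state NFA already becomes an 8-state DFA and the n = 12 case trips the (deliberately small) limit. Matching itself stays linear in the input length regardless of how many states exist.
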
      - Final Thoughts
        
        Finding a way to eliminate numerous packets in initial filters will really help in lowering performance cost
        
        Understanding the signature rules and the traffic relevant to them is also essential. A low performance cost when inspecting a set of traffic does not necessarily reflect the DPI's capability; it can also depend on the traffic the environment produces. For example, if the traffic that usually goes through is P2P but the signature rules used are mostly for HTTP, then this kind of metric is not strong evidence, and more indicators are needed to support the performance evaluation.
      - Evaluation
        
        Setup
        
        Signature database of 17 filters
        
        Traffic was collected at two (2) universities. POLITO-GT has more P2P and WebTV traffic. POLITO has more normal traffic, generated by 6000 hosts. UNIBS-GT also has P2P traffic but more normal traffic, generated by 20 hosts.
        
        Goal
        
        Show the performance improvement
        
        Show the accuracy in terms of percentage of unknown traffic
        
        DFA as Pattern Matching Algorithm
        
        DFA memory occupancy grows exponentially when the regular expression used is complex (without wildcard * or anchored ^), but the execution cost is lower
        
        Anchored (^ or *) is more prone to false positive but Kleene (complex regex) is costly in nonmatching cases
        
        Performance of DPI PBFS
        
        The cost of pattern matching depends on the packet size: a 1-byte payload has the lowest cost and a 1460-byte payload the highest
        
        Classification results for POLITO-GT, POLITO, and UNIBS-GT differ because of the sources. POLITO-GT has the highest percentage of unknown traffic because a large amount of its traffic came from P2P, while POLITO has less unknown traffic since the collected traffic came from known applications.
        
        DPI Optimizations
        
        UDP has a lower optimization cost because its packets are smaller than TCP's
        
        Some protocols (e.g. telnet, Direct Connect++) were checked across a high number of packets due to poor signature classification, so limiting the number of classification attempts helps improve DPI performance
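
A minimal sketch of packet-based, per-flow state with a cap on classification attempts, as discussed in the notes above. The two signatures, the limit of 3 attempts and the class names are hypothetical; the evaluated system used a database of 17 filters that is not reproduced here.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical anchored signatures applied to the start of a packet payload.
SIGNATURES = {
    "http": re.compile(rb"^(GET|POST|HEAD) "),
    "ssh": re.compile(rb"^SSH-"),
}


@dataclass
class FlowState:
    label: Optional[str] = None
    attempts: int = 0


class PacketClassifier:
    def __init__(self, signatures, max_attempts: int = 3):
        self.signatures = signatures
        self.max_attempts = max_attempts
        self.flows: dict[tuple, FlowState] = {}
        self.inspected = 0   # number of packets actually pattern-matched

    def classify(self, flow_key: tuple, payload: bytes) -> str:
        state = self.flows.setdefault(flow_key, FlowState())
        if state.label is not None:
            return state.label      # flow already classified: skip matching
        if state.attempts >= self.max_attempts or not payload:
            return "unknown"        # give up, or nothing to inspect
        state.attempts += 1
        self.inspected += 1
        for label, pattern in self.signatures.items():
            if pattern.match(payload):
                state.label = label
                return label
        return "unknown"


clf = PacketClassifier(SIGNATURES)
web = ("10.0.0.1", 40000, "10.0.0.2", 80, "tcp")
p2p = ("10.0.0.3", 50000, "10.0.0.4", 6881, "udp")

web_labels = [clf.classify(web, p)
              for p in (b"GET / HTTP/1.1\r\n", b"<html>", b"more body")]
p2p_labels = [clf.classify(p2p, b"\x13opaque") for _ in range(5)]
```

Of the eight packets above only four are pattern-matched: one for the web flow, which is classified on its first packet, and three for the opaque flow before the attempt limit stops further inspection.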