Thursday, February 13, 2020

MongoDB 4.2 Hybrid Index Build

Earlier -
  Foreground Index Build - Most performant but locks entire database for duration of index build. No reads/writes are permitted.
  Background Index Build - Slower. Incremental approach. Periodically locks the database, but yields to incoming read/write operations. If the index is larger than available RAM, a background build can take much longer than a foreground build.
Another downside is that the index structure resulting from a background build is less efficient than the one resulting from a foreground build.

Hybrid Index Build - Best of both worlds. Performance of foreground index build and the non-locking property of background index build.
Index structure remains unchanged.

Under the hood

Every collection and index in WiredTiger is stored as a table. All collection files and index files in the dbPath are backed in WT by table objects.

collection-4--7758868473387840549.wt
collection-6--6388518888314681728.wt
index-1--6388518888314681728.wt
index-1--7758868473387840549.wt

Aside from clearly identifiable collection and index table files, there are some internal tables used by MongoDB to write index keys during index build and a temporary WT table that is used to accommodate some writes that need to be staged before being inserted in the expected collection or index table.

1. Take exclusive lock on the collection and create 2 temporary tables for index creation.
    These are visible in the dbPath for the duration of the index build.
2. Remove the exclusive lock and apply a weaker lock on the collection.
3. Start collection scan.
  - While doing collection scan all index keys are generated in an external sorter - similar to foreground index build.
  - During this time, all the index keys from the inserts are side written into a temporary table.
    Documents are written to the collection as normal. Only index keys are written to the temp table.
4. After completing collection scan, keys are indexed in order.
   - Temp table is drained of the index keys, and index is created with ordered index keys from temp table.
5. If the index being created is a unique index, duplicate key violations are checked.
   - The second temp table is used to keep track of duplicate keys.
   - Only at the end of the index build process are constraint violations checked and an error returned.
6. The temp tables are removed and locks are released.
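
The steps above can be sketched as a toy in-memory model. This is only an illustration of the scan / side-write / drain idea, not MongoDB's actual implementation; every name in it (hybridIndexBuild, sideWrites and so on) is hypothetical.

```javascript
// Toy model of a hybrid index build. Not MongoDB's real code.
// existingDocs: documents found by the collection scan
// concurrentInserts: documents inserted while the scan is running
function hybridIndexBuild(existingDocs, concurrentInserts, field, unique = false) {
  // Step 3: the collection scan feeds index keys into a sorter.
  const sorter = existingDocs.map(doc => ({ key: doc[field], id: doc._id }));

  // Step 3 (concurrent): keys from inserts during the scan are
  // side-written to a temporary table instead of the index itself.
  const sideWrites = concurrentInserts.map(doc => ({ key: doc[field], id: doc._id }));

  // Step 4: sorted keys go into the index, then the side table is drained.
  const byKey = (a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0);
  const index = sorter.sort(byKey);
  for (const entry of sideWrites) index.push(entry);
  index.sort(byKey);

  // Step 5: duplicates are only checked and reported at the very end.
  if (unique) {
    const duplicates = [];
    for (let i = 1; i < index.length; i++) {
      if (index[i].key === index[i - 1].key) duplicates.push(index[i].key);
    }
    if (duplicates.length > 0) {
      throw new Error("duplicate keys: " + duplicates.join(", "));
    }
  }
  return index.map(entry => entry.key);
}
```

Calling it with two scanned documents and one concurrent insert returns the keys in index order; with unique set to true and a repeated key, the error only surfaces after everything has been drained.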
 

Sunday, February 2, 2020

GeoJSON

Format for encoding geographic Data Structures

location : {
  "type" : "Point",
  "coordinates" : [12.2, 13.1]
}

GeoJSON supports following geometries -
 - Point
 - MultiPoint
 - Polygon
 - MultiPolygon
 - LineString
 - MultiLineString

 - GeometryCollection
 - Feature
 - FeatureCollection

Geospatial Indexes 

2dsphere - supports queries that calculate geometries on an earth-like sphere
     db.collection.createIndex( { <location field> : "2dsphere" } )
2d - supports queries that calculate geometries on a 2D plane
     db.collection.createIndex( { <location field> : "2d" } )

These indexes can not cover a query.
These indexes can not be used in shard key.

The following geospatial operations are supported on sharded collections -
  - $geoNear aggregation stage
  - $near and $nearSphere query operators (MongoDB 4.0 and later)

geoHaystack - optimised to return results over a small area. Improves performance of queries that use flat geometry. For queries using spherical geometry, it is better to use 2dsphere indexes.
geoHaystack requires the first field to be the location.
It creates buckets of documents from the same geographic area to improve query performance.
Each bucket contains all documents within a specified proximity to a given latitude and longitude.
geoHaystack indexes are sparse by default.

https://docs.mongodb.com/manual/tutorial/query-a-geohaystack-index/#geospatial-indexes-haystack-queries


Query Selectors

$geoIntersects - selects geometries that intersect with the given geometry
    2dsphere supports this. Does not require a geospatial index.

$geoWithin - selects geometries within a bounding geometry. Both 2dsphere and 2d indexes support this
     $geometry, $centerSphere
   
     $centerSphere takes its radius in radians.
     3963.2 miles - radius of earth. Divide a distance in miles by this to get the radius in radians
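
A small sketch of the miles-to-radians conversion, building a $geoWithin / $centerSphere filter as a plain object. The helper names, the field name "location" and the sample coordinates are made up for illustration.

```javascript
// Radius of the earth in miles, as noted above.
const EARTH_RADIUS_MILES = 3963.2;

// Convert a distance in miles to radians for $centerSphere.
function milesToRadians(miles) {
  return miles / EARTH_RADIUS_MILES;
}

// Build a filter for documents whose `field` lies within `miles`
// of the point [lon, lat]. Hypothetical helper, not a driver API.
function withinMiles(field, lon, lat, miles) {
  return {
    [field]: {
      $geoWithin: { $centerSphere: [[lon, lat], milesToRadians(miles)] }
    }
  };
}

// Example: everything within 10 miles of a sample point.
const filter = withinMiles("location", -74, 40.74, 10);
```

The resulting object can be passed to find() as is.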

$near - returns geospatial objects in proximity to a point. Both 2dsphere and 2d indexes support this

$nearSphere - returns geospatial objects in proximity to a point on a sphere
Both 2dsphere and 2d indexes support this
      $geometry, $minDistance, $maxDistance


$geoNear - AggregationFramework
   - near .. GeoJSON point
   - spherical .. must be true if using a 2dsphere index
   - maxDistance, minDistance .. meters (when near is a GeoJSON point)
   - distanceField
   - distanceMultiplier

1609.34 converts miles to meters
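
A sketch of a $geoNear stage that takes its limit in miles and reports distances back in miles. The function name and the "distanceMiles" field name are assumptions for illustration; the stage options are the ones listed above.

```javascript
const METERS_PER_MILE = 1609.34;

// Build a $geoNear stage around [lon, lat], limited to maxMiles.
// Hypothetical helper; only the option names come from MongoDB.
function geoNearStage(lon, lat, maxMiles) {
  return {
    $geoNear: {
      near: { type: "Point", coordinates: [lon, lat] },
      spherical: true,
      // maxDistance is in meters for a GeoJSON point.
      maxDistance: maxMiles * METERS_PER_MILE,
      distanceField: "distanceMiles",
      // Multiply the computed meters so the output field is in miles.
      distanceMultiplier: 1 / METERS_PER_MILE
    }
  };
}

const stage = geoNearStage(-74, 40.74, 5);
```

The stage has to be the first one in the aggregation pipeline.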

Geometry Specifiers -

$box - specifies a rectangular box using legacy coord pairs for $geoWithin. 2d index

$center - specifies a circle using legacy coord pairs for $geoWithin. 2d index

$centerSphere - specifies a circle using either legacy coord pairs or GeoJSON object with $geoWithin.
2d or 2dsphere index.

$geometry - specifies a geometry in GeoJSON format for search with geospatial operators.
Uses EPSG:4326 as the default Coordinate Reference System (CRS).

$geometry: {
   type: "<GeoJSON object type>",
   coordinates: [ <coordinates> ]
}

s.find({ "o": {$eq:"99e01530-4538-4e26-b834-0072011af7ed"},
         "t":{$gte: 0, $lte: 176829836394000 },
         "l": {$geoWithin:
                 {$geometry:
                    { type: "Polygon",
                         coordinates:[
                              [[59.907542457804084, 86.79600024595857],
                               [59.907542457804084, 80.64036540314555],
                               [66.908002791926265, 80.64036540314555],
                               [66.908002791926265, 86.79600024595857],
                               [59.907542457804084, 86.79600024595857]
                              ] 
                         ]
                    }
                 }
              }
       })


$maxDistance - specifies max distance in meters (for GeoJSON points) to limit the results of $near and $nearSphere.
2d and 2dsphere indexes.

$minDistance - specifies min distance in meters (for GeoJSON points) to limit the results of $near and $nearSphere.
2dsphere indexes only.

$polygon - specifies a polygon in legacy coord pairs for $geoWithin. Uses planar geometry.
2d index. Can even query without a 2d index.

$uniqueDocs - deprecated. modifies $geoWithin and $near queries to ensure that even a document that matches query multiple times, is returned only once.




https://docs.mongodb.com/manual/reference/operator/query-geospatial/

Thursday, January 30, 2020

MongoDB Atlas Cluster Auto-Scaling


Option in Atlas - 

Available for M10+ clusters using General storage class.


- Can specify a minimum cluster size
- Can specify a maximum cluster size
- Can choose to enable/disable downsizing

Works on a rolling basis. Process does not incur any downtime. 

Cluster Tier scaling     Beta

Metrics taken into account for Cluster Tier scaling up - 
If one of the below is true
  - cpu util > 75% for past 1 hour
  - memory util > 75% for past 1 hour

Metrics taken into account for Cluster Tier scaling down - 
if both of below are true
  - based on data collected in past 72 hours, desired cpu and memory util won't exceed 50%
  - cluster has not been scaled down in past 72 hours

Storage Tier scaling

When Disk usage exceeds 90% -
 AWS/GCP - increase capacity such that the disk usage becomes 70%
 Azure - double the capacity
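
The rules above can be written down as a few small functions. This is only a sketch of the thresholds in these notes, not Atlas's actual logic; the function names, parameter names and provider strings are assumptions.

```javascript
// Scale the cluster tier up if either metric stayed above 75% for the past hour.
function shouldScaleUp({ cpuUtil, memUtil }) {
  return cpuUtil > 75 || memUtil > 75;
}

// Scale down only if projected utilisation stays within 50%
// and there was no scale-down in the past 72 hours.
function shouldScaleDown({ projectedCpuUtil, projectedMemUtil, hoursSinceLastScaleDown }) {
  return projectedCpuUtil <= 50 && projectedMemUtil <= 50 && hoursSinceLastScaleDown >= 72;
}

// New disk capacity once usage exceeds 90%.
function newDiskCapacityGb(provider, capacityGb, usedGb) {
  if (usedGb / capacityGb <= 0.9) return capacityGb; // below threshold, no change
  if (provider === "AZURE") return capacityGb * 2;   // Azure: double the capacity
  return usedGb / 0.7;                               // AWS/GCP: usage becomes 70%
}
```

For example, a 100 GB AWS disk with 91 GB used grows to about 130 GB, while the same disk on Azure grows to 200 GB.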


Friday, January 24, 2020

Perfect Forward Secrecy

Without PFS, all data transmitted between the server and client can be compromised if the server's private key is ever disclosed.
An attacker can store the encrypted traffic until they get their hands on the private key.

During initial handshake, client creates the Pre-master Secret (PMS 😂)
PMS is then encrypted with the server's public key and sent to server. Server decrypts it with its private key.
Using the PMS, both server and client generate the symmetric session keys (Master Secret)
If the attacker were storing all the data right from the handshake, once the server's private key is disclosed, the attacker can get the PMS, generate the MS and decrypt all the data sent thereafter.

To enable PFS, the client and server should be able to use a cipher suite that includes Diffie-Hellman key exchange. Key exchange should be ephemeral i.e. the client and server need to generate a new set of D-H parameters (key material) for each session.
The key material is exchanged in clear text.
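The key material then goes through the D-H math to generate the shared secret.
An eavesdropper sees only the public values. Recovering the shared secret from them requires solving the discrete logarithm problem, which is computationally infeasible for properly sized parameters.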

Now, even if private key of the server is compromised, it does not aid the attacker to get the session key because it was never encrypted with the server's public key.

Further, new DH-parameters are chosen for each session, and a new shared secret is generated. Thus, even if by some luck the attacker is able to compromise the shared secret, he will have done it for only that session.
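
A toy Diffie-Hellman exchange with deliberately tiny numbers, to show how both sides reach the same shared secret from values sent in the clear. The prime, generator and private values below are illustrative only; real TLS uses far larger parameters (or elliptic curves) and fresh random private values per session.

```javascript
// Modular exponentiation with BigInt: (base ^ exp) mod mod.
function modPow(base, exp, mod) {
  let result = 1n;
  base %= mod;
  while (exp > 0n) {
    if (exp & 1n) result = (result * base) % mod;
    base = (base * base) % mod;
    exp >>= 1n;
  }
  return result;
}

// Public parameters, sent in clear text (toy values).
const p = 23n;
const g = 5n;

// Ephemeral private values, never sent over the wire.
const clientPrivate = 6n;
const serverPrivate = 15n;

// Public values exchanged in clear text.
const clientPublic = modPow(g, clientPrivate, p);
const serverPublic = modPow(g, serverPrivate, p);

// Each side combines its own private value with the other's public value.
const clientSecret = modPow(serverPublic, clientPrivate, p);
const serverSecret = modPow(clientPublic, serverPrivate, p);
```

Both clientSecret and serverSecret come out as the same number, and the server's long-term private key is not involved in computing it.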

How to get PFS ?
 - Ensure you have TLS 1.2 or later
 - Choose cipher suites with ephemeral key exchange (DHE/ECDHE) and ensure they are correctly ordered.

Heartbleed -

An OpenSSL bug that allows an attacker to extract data from server's memory e.g. server's private key.
CVE-2014-0160 


https://scotthelme.co.uk/perfect-forward-secrecy/

Perfect Forward Secrecy does not prevent a man-in-the-middle attacker who holds the server's private key from impersonating the server.
PFS means that if an attacker obtains your private key in the "future", they can NOT decrypt your "past" communications they might have recorded.




Sunday, January 12, 2020

JavaScript ES6 Arrow functions

A little dive into => syntactic sugar.

Before :

> hello = function() { return "Hello World !" }
function() { return "Hello World !" }
> hello()

Hello World !

Now :

> hello = () => { return "Hello World !" }
() => { return "Hello World !" }
> hello()

Hello World !

> hello = () => "Hello World !"
() => "Hello World !"
> hello()

Hello World !

> [1,2,3].map( function(n) { return n*n } );
[ 1, 4, 9 ]
> [1,2,3].map( n => n*n );

[ 1, 4, 9 ]

Friday, January 10, 2020

TLS


Transport Layer Security (TLS), or Secure Sockets Layer (SSL) as it was previously referred to, uses symmetric and asymmetric encryption to secure network traffic. It allows the client to verify that the server is who it claims to be.

Symmetric Key Encryption

A single key is used for both encryption and decryption. 
Now the important thing - the entities using symmetric key encryption must exchange the key so that it can be used in the decryption process. This key exchange is the main cause of nervous jitters.

AES (e.g. in GCM mode) / DES

Asymmetric Key Encryption (aka Public key cryptography)

Here the encryption and decryption keys are different. 

One key is called public key, which is made available to everyone. 
The other is called private key, which is kept private by the key owner.

This solves the problem of key distribution.

RSA/ECDH

Symmetric key encryption is faster and cheaper than asymmetric key encryption, which is very expensive in terms of computation cycles. 

X.509 Certificate 

The X.509 certificate is used to verify that a public key belongs to the entity contained within the certificate.

The certificate contains information about the identity to which the certificate is issued and the identity that issued it. It contains info like - 
 - version
 - Serial #
 - Algo info 
 - issuer DN - name of issuing authority (the CA)
 - Validity period
 - Subject DN - name of identity the certificate is issued to
 - Subject public key

Signed Certificate Generation

1. User generates the Public and Private key using PKI infrastructure
    
UserPub   
UserPri

2. User generates the User Identity information

UserId

3. User generates the Certificate Signing request and sends it to the CA

CSR = fhash(UserPub + UserId)

4. CA encrypts the CSR with its private key to create the signature.
    CA's public key is known to the world

Sign = fencrypt[CA_pri](CSR)

5. Signed certificate is generated -

Certsigned= UserPub + UserId+ Sign


To verify authenticity, the user provides the signed certificate.

The receiver can then decrypt the signature with the CA public key, and compare with the fhash(UserPub + UserId)

If they match, the receiver can be certain that the Certificate is valid. 
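
The signing and verification steps above can be tried out with Node's built-in crypto module. This is a simplified sketch of the idea (hash the public key plus identity, sign with the CA's private key, verify with the CA's public key), not a real X.509 encoder; the certSigned object shape and the verifyCert helper are hypothetical.

```javascript
const crypto = require("crypto");

// Key pairs for the CA and the user (step 1).
const ca = crypto.generateKeyPairSync("rsa", { modulusLength: 2048 });
const user = crypto.generateKeyPairSync("rsa", { modulusLength: 2048 });

// UserPub and UserId (steps 1 and 2).
const userPub = user.publicKey.export({ type: "spki", format: "pem" });
const userId = "CN=orclbykuber";

// Step 4: the CA hashes UserPub + UserId with SHA-256 and signs the
// digest with its private key. crypto.sign does both in one call.
const sign = crypto.sign("sha256", Buffer.from(userPub + userId), ca.privateKey);

// Step 5: the "signed certificate" in this toy model.
const certSigned = { userPub, userId, sign };

// The receiver recomputes the hash and checks it against the signature
// using the CA's public key.
function verifyCert(cert, caPublicKey) {
  const data = Buffer.from(cert.userPub + cert.userId);
  return crypto.verify("sha256", data, caPublicKey, cert.sign);
}
```

verifyCert(certSigned, ca.publicKey) succeeds, and changing userId or userPub after signing makes it fail.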

How TLS uses Certificates

1. Client -> Server
    Client Hello
    TLS version number
    List of cipher suites that client supports
2. Server -> Client
    Server chooses the cipher suite
    Server Hello
    Signed Certificate
    Hello Done
3. Client -> Server
    Client validates certificate
    Generates Pre-Master Secret
    Generates Symmetric Key using Pre-Master Secret
    Encrypts Pre-Master Secret with the server public key
    Sends the encrypted Pre-Master Secret to server (key exchange)
    Client Finished
4. Server -> Client
    Server decrypts the Pre-Master Secret using its private key
    Generates Symmetric key using Pre-Master Secret
    Change Cipher Spec
    Server Finished
5. Client <-> Server
    Encrypted communication using Symmetric Keys starts
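
Steps 3 and 4 (the RSA key exchange) can be sketched with Node's crypto module. The key derivation here is a deliberately simplified stand-in: real TLS mixes in random values from the hello messages through its own PRF, so the deriveSessionKey helper and the two random values are assumptions for illustration only.

```javascript
const crypto = require("crypto");

// The server's key pair; the public key reaches the client in the certificate.
const server = crypto.generateKeyPairSync("rsa", { modulusLength: 2048 });

// Step 3: the client generates the Pre-Master Secret and encrypts it
// with the server's public key.
const preMasterSecret = crypto.randomBytes(48);
const encryptedPms = crypto.publicEncrypt(server.publicKey, preMasterSecret);

// Step 4: the server decrypts it with its private key.
const decryptedPms = crypto.privateDecrypt(server.privateKey, encryptedPms);

// Simplified key derivation (NOT the real TLS PRF): HMAC over two
// random values that both sides have seen.
function deriveSessionKey(pms, clientRandom, serverRandom) {
  return crypto.createHmac("sha256", pms)
    .update(clientRandom)
    .update(serverRandom)
    .digest();
}

const clientRandom = crypto.randomBytes(32);
const serverRandom = crypto.randomBytes(32);
const clientKey = deriveSessionKey(preMasterSecret, clientRandom, serverRandom);
const serverKey = deriveSessionKey(decryptedPms, clientRandom, serverRandom);
```

Both sides end up with the same symmetric key without it ever crossing the wire. Anyone who later obtains the server's private key can decrypt a recorded encryptedPms, which is exactly the weakness PFS addresses.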


Self signed certificate

Create a Certificate Authority

openssl req -newkey rsa:2048 -new -x509 -sha256 -extensions v3_ca -out ca.cert -keyout ca.key -subj "/C=OZ/ST=SA/L=SYD/O=Oracle/CN=kubersCA.com" -nodes


sudo /bin/rm -f /etc/pki/CA/index*
sudo /bin/rm -f /etc/pki/CA/serial*


sudo touch /etc/pki/CA/index.txt
echo 1000 | sudo tee  /etc/pki/CA/serial

This generates CA certificate and  CA private key
ca.cert
ca.key

Request a certificate

Certificate Signing Request (CSR) is our unsigned form to send off to the CA.

host="orclbykuber"
clusterdesc="kuberstest"

openssl req -newkey rsa:2048 -nodes -new  -sha256 -out ${host}.csr -keyout ${host}.key -subj "/C=OZ/ST=SA/L=SYD/O=MongoDB/OU=$clusterdesc/CN=$host"

This generates 2 files -
orclbykuber.key <<<<< private key
 orclbykuber.csr <<<<<< Certificate Signing Request


Sign the certificate

sudo openssl ca  -in ${host}.csr -out ${host}.cert -keyfile ca.key -cert ca.cert -outdir . -batch

This generates the signed certificate

orclbykuber.cert


Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 4096 (0x1000)
    Signature Algorithm: sha256WithRSAEncryption
        Issuer: C=OZ, ST=SA, L=SYD, O=Oracle, CN=kubersCA.com
        Validity
            Not Before: Jan 10 18:17:24 2020 GMT
            Not After : Jan  9 18:17:24 2021 GMT
        Subject: C=OZ, ST=SA, O=Oracle, OU=kuberstest, CN=orclbykuber
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                Public-Key: (2048 bit)
                Modulus:
                    00:9d:ed:a7:86:30:3d:3e:9f:0c:83:aa:2d:a6:8a:
                    f1:49:a0:b1:54:d2:9d:08:12:54:73:f4:87:68:b5:
                    0d:3d:70:ab:6e:69:06:64:60:20:ad:e0:e3:a2:a5:
                    48:ec:1c:2e:b9:67:e2:64:ba:7a:15:85:4a:21:24:
                    a6:4d:31:c7:8a:7c:ba:ab:b9:44:78:01:80:ea:4b:
                    59:9b:c1:5a:64:be:dd:0a:89:59:ed:2c:41:ab:0f:
                    d1:dc:77:d3:0a:a3:7a:77:5f:1b:3a:45:e5:13:89:
                    cd:0e:c4:86:a3:0c:74:a0:15:f0:15:54:96:c2:66:
                    69:a1:7f:fb:9e:81:37:93:9f:5a:d3:b2:84:95:04:
                    2a:3e:7e:6c:75:0e:c9:01:ae:a6:fd:5e:dd:29:80:
                    3c:21:64:8d:04:24:b5:0d:4d:0c:45:96:7f:63:ad:
                    d4:80:c1:71:1b:fb:b1:9a:ef:c9:ea:ef:fd:7a:da:
                    7d:4d:64:6b:2e:5b:00:c5:88:b7:eb:88:d3:76:dd:
                    43:93:07:f0:92:b3:a9:24:1a:c5:f8:03:aa:5d:20:
                    2b:75:4a:b7:86:de:42:50:7d:1b:a4:e7:20:6e:b0:
                    4d:a8:54:2e:7c:d7:1a:77:6c:ed:eb:c2:fe:22:c9:
                    de:2f:d1:f7:d6:62:83:b1:2e:a9:11:dc:93:ec:39:
                    9f:89
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Basic Constraints: 
                CA:FALSE
            Netscape Comment: 
                OpenSSL Generated Certificate
            X509v3 Subject Key Identifier: 
                66:4C:1C:CC:C2:AD:0D:A3:B5:FB:9A:43:B7:96:99:83:82:94:B1:EA
            X509v3 Authority Key Identifier: 
                keyid:A1:0C:15:02:3D:10:38:FF:0F:5B:DB:C0:F4:03:33:CE:4A:B6:C4:B1

    Signature Algorithm: sha256WithRSAEncryption
         a4:41:30:d7:aa:a3:2b:04:3e:0f:32:bd:81:e8:18:9b:94:4d:
         e4:7f:05:b9:d4:5d:79:34:f1:0e:52:ee:b9:22:02:4d:2c:aa:
         91:e7:da:d1:57:21:7e:17:9d:fc:2e:ed:55:3d:3b:87:63:35:
         76:35:78:d5:64:03:b6:a1:22:67:d3:4e:94:dc:8e:32:91:46:
         c5:e0:6e:d1:fa:40:c1:fe:45:e5:65:45:97:8b:22:ad:0a:ba:
         aa:7d:a6:84:69:7c:94:37:6f:07:72:b6:b3:c5:73:4d:79:16:
         2b:60:88:dd:01:18:ee:6f:ca:b3:2a:cd:54:33:7d:55:f4:af:
         a9:b4:94:aa:37:75:7c:f8:9c:dd:e6:69:27:42:fe:76:6b:2b:
         68:0b:b5:72:a2:29:7e:19:59:0c:b2:b8:80:ac:26:b5:b7:93:
         8a:d5:cb:e1:a3:8e:c9:a2:ce:34:3a:ed:ba:eb:4c:25:f4:a2:
         ee:5f:8e:91:b1:e1:05:13:83:33:40:31:2c:cf:e4:07:6a:b2:
         2f:91:c0:78:a2:a3:d6:c5:c1:0e:ca:60:64:b5:af:23:a8:4a:
         e6:b8:35:90:0d:72:6e:09:3d:cb:ff:fd:2a:32:5c:24:47:87:
         c3:d0:b1:a8:b4:5a:d1:ce:2e:0b:c7:3c:45:3f:8d:0e:68:02:
         22:68:f4:19
-----BEGIN CERTIFICATE-----
MIIDwDCCAqigAwIBAgICEAAwDQYJKoZIhvcNAQELBQAwXTELMAkGA1UEBhMCR0Ix
ETAPBgNVBAgMCFNjb3RsYW5kMRAwDgYDVQQHDAdHbGFzZ293MRAwDgYDVQQKDAdN
b25nb0RCMRcwFQYDVQQDDA5teWxpdHRsZWNhLmNvbTAeFw0yMDAxMTAxODE3MjRa
Fw0yMTAxMDkxODE3MjRaMGwxCzAJBgNVBAYTAkdCMREwDwYDVQQIDAhTY290bGFu
ZDEQMA4GA1UECgwHTW9uZ29EQjETMBEGA1UECwwKa3ViZXJzdGVzdDEjMCEGA1UE
Awwaa3ViZXJnYXVyMDEubWRicmVjcnVpdC5uZXQwggEiMA0GCSqGSIb3DQEBAQUA
A4IBDwAwggEKAoIBAQCd7aeGMD0+nwyDqi2mivFJoLFU0p0IElRz9IdotQ09cKtu
aQZkYCCt4OOipUjsHC65Z+JkunoVhUohJKZNMceKfLqruUR4AYDqS1mbwVpkvt0K
iVntLEGrD9Hcd9MKo3p3Xxs6ReUTic0OxIajDHSgFfAVVJbCZmmhf/uegTeTn1rT
soSVBCo+fmx1DskBrqb9Xt0pgDwhZI0EJLUNTQxFln9jrdSAwXEb+7Ga78nq7/16
2n1NZGsuWwDFiLfriNN23UOTB/CSs6kkGsX4A6pdICt1SreG3kJQfRuk5yBusE2o
VC581xp3bO3rwv4iyd4v0ffWYoOxLqkR3JPsOZ+JAgMBAAGjezB5MAkGA1UdEwQC
MAAwLAYJYIZIAYb4QgENBB8WHU9wZW5TU0wgR2VuZXJhdGVkIENlcnRpZmljYXRl
MB0GA1UdDgQWBBRmTBzMwq0No7X7mkO3lpmDgpSx6jAfBgNVHSMEGDAWgBShDBUC
PRA4/w9b28D0AzPOSrbEsTANBgkqhkiG9w0BAQsFAAOCAQEApEEw16qjKwQ+DzK9
gegYm5RN5H8FudRdeTTxDlLuuSICTSyqkefa0Vchfhed/C7tVT07h2M1djV41WQD
tqEiZ9NOlNyOMpFGxeBu0fpAwf5F5WVFl4sirQq6qn2mhGl8lDdvB3K2s8VzTXkW
K2CI3QEY7m/KsyrNVDN9VfSvqbSUqjd1fPic3eZpJ0L+dmsraAu1cqIpfhlZDLK4
gKwmtbeTitXL4aOOyaLONDrtuutMJfSi7l+OkbHhBRODM0AxLM/kB2qyL5HAeKKj
1sXBDspgZLWvI6hK5rg1kA1ybgk9y//9KjJcJEeHw9CxqLRa0c4uC8c8RT+NDmgC
Imj0GQ==
-----END CERTIFICATE-----

Create the PEM file

Privacy Enhanced Mail (.pem) file is created by gluing together user private key and the signed certificate.

cat orclbykuber.key orclbykuber.cert > orclbykuber.pem

Ensure permissions on the pem file are secure - 0400 or 0600

Verify the certificate's legitimacy using CA's certificate

openssl verify -CAfile ca.cert orclbykuber.pem





https://www.youtube.com/watch?v=cuR05y_2Gxc

ssh-agent

scp -i myorclblog.pem myorclblog.p* ec2-user@52.51.45.183:

1. start the ssh agent

eval "$(ssh-agent -s)"

2. add the key to ssh

ssh-add myorclblog.pem