Monday, November 25, 2019

Anatomy of a check scam


I was walking home on a Saturday afternoon through a shopping center that has a Bank of America branch (where I have an account). All of a sudden, a guy walked up to me and told me an interesting story. He said he wasn't a bum or anything, he had a lot of money. He then basically said his sister had been in a car crash and the insurance company had sent her a check which needed to be cashed. He couldn't do it as he didn't have a BOA account, so could I cash the check and give him the money instead, and I'd get the $40 that was left over. Now I didn't really care about the $40, but I thought, "What's the risk here? Maybe it's legit?" So I said okay, sure.

Then he asks me for my name (not email or SSN or anything), which sort of got me thinking, but I still go ahead. He texts someone and says, "Yeah she'll be right over." She doesn't come over, but her "boyfriend" does. The "boyfriend" has a check, thanks me a lot, and in the "Pay To" section there's my name with an amount of $4,963 or whatever written in. I'm still so fixated on the idea that someone needs help that I'm not thinking, "Which insurance company issues a blank check? And why?"

Anyway, we go into the bank and there's a long line. Something intuitively doesn't feel right at this point, so I look at the check and start studying its contents. It says "deposit refund" in the memo. Hmm? I ask why and he says, "Yeah, it's refunding my deposit." So I think, okay? Maybe it's a down payment that's being refunded? I don't know. I keep standing in the line.

The line's really long though and my spider sense is still tingling. I don't know why. Maybe it's because I've worked in security a lot, maybe it's my brain intuitively guessing something's off, but I take out my phone and simply take a picture of the check without asking the dude. A few minutes later the guy says, "I think I'll just ask him to do a direct deposit." I'm like, "You sure?" He says yes and we walk away. No harm done. As I walk away I see the first guy (from the back) talking to someone. Hmm, I wonder why...

Anyway, I go home and something's still gnawing at me. I tell my wife all this and she's interested too. She immediately thinks it's a fraud, though. Don't know how. The address on the check is legit, which lowers the probability of a forgery, although that can't be ruled out either. The person whose address it is doesn't pick up, so I leave a voicemail and call BOA. BOA's fraud department is immediately helpful and, after listening to the story, says, "By the time you finished, I was thinking - please tell me you didn't give them your money." And I'm like, "No, but how does this work?"

In short, the check is forged (somehow) and the bank cashes it. The scammers take the cash and run. Later the check turns out to be a forgery and the bank claims the money back from the person they paid it out to, and there's nothing that person can do about it either. I'm on the hook because I'm the one they paid out to. The person whose name was on the check returns my call Sunday afternoon and I tell them all about this. Turns out their office was broken into, and their card and checks were stolen. So the check itself wasn't forged, just stolen; they'd have been liable, and I don't know if I'd have then been liable as well and put in jail or whatever.

All's well that ends well and I'll call the cops and make a report too, but I got really, really lucky this time. The thing that bothers me the most is that I'm someone who works in security all the time and I really should not have been so trusting of someone just because the communication wasn't digital. If the same thing had happened digitally, I'd have caught all the red flags inside five minutes or less. But just because it's a person... face to face... I let my guard down. I failed this time.

To conclude - the only advice I can give anyone is to remain calm and keep thinking when weird, uncertain shit like this happens. If you do, you have a good chance of staying safe. Oh, and on a funny note, I also got to hear my wife say, "What would you ever do without me around? You're so gullible :)". Happy holidays everyone.

Saturday, June 1, 2019

AWS - Security, Identity and Compliance

This post describes a number of services that are relevant to AWS security. It's recommended that you get to know all of these services as well as possible.


IAM: This is the heart of all the authentication and authorization that AWS services perform. If there's one service you should learn inside and out, this is it. Admins can create IAM users and roles, associate access keys (for programmatic access) and assign permissions to each user and role. Developers can use the access keys to programmatically invoke AWS services, subject to the permissions assigned to the user/role. Additionally, almost all (if not all) services create service-linked roles and assume IAM roles to perform operations in other services. Here is one such example. IAM can be used to control a user's access to an entire service, to specific APIs in a service or, in many cases, to specific resources as well.
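
To give a feel for what this looks like programmatically, here's a minimal boto3 sketch that creates a user, attaches a managed policy and issues access keys. The user name is a placeholder; the policy ARN is one of AWS's standard managed policies.

```python
import boto3

iam = boto3.client("iam")

# Create an IAM user (placeholder name)
iam.create_user(UserName="demo-user")

# Attach an AWS managed policy granting read-only access to S3
iam.attach_user_policy(
    UserName="demo-user",
    PolicyArn="arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess",
)

# Create access keys for programmatic access
keys = iam.create_access_key(UserName="demo-user")
print(keys["AccessKey"]["AccessKeyId"])
```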

Resource Access Manager: This is a service that allows one account to share resources with another account. The account the resources are shared with can perform actions on them similar to the owner. This helps reduce operational costs and also the overall attack surface (since there are fewer things to manage). However, only a few resource types can be shared as of now. Here is a walkthrough of this service by the ever helpful Jeff Barr.

Cognito: Cognito handles authentication for web and mobile applications. It is Amazon's user directory: users authenticate against a user pool and obtain a user-pool token. They can authenticate directly against user records stored in Cognito or via a federated provider such as Google or Facebook. The user-pool token is then exchanged, via an identity pool, for temporary AWS credentials using the STS service (which does not have a web console :)), transparently to the user. These credentials are then used to access AWS resources.
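
As a rough illustration, here's how an application might authenticate a user against a Cognito user pool with boto3. The client ID, username and password are placeholders, and this assumes the USER_PASSWORD_AUTH flow is enabled on the app client.

```python
import boto3

cognito = boto3.client("cognito-idp")

# Authenticate against the user pool; returns user-pool tokens on success
resp = cognito.initiate_auth(
    ClientId="YOUR_APP_CLIENT_ID",            # placeholder
    AuthFlow="USER_PASSWORD_AUTH",
    AuthParameters={
        "USERNAME": "alice@example.com",       # placeholder
        "PASSWORD": "correct-horse-battery",   # placeholder
    },
)
id_token = resp["AuthenticationResult"]["IdToken"]
# The ID token can then be exchanged via an identity pool for temporary AWS credentials.
```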

Secrets Manager: Like the name suggests, this stores credentials securely using KMS. Instead of hard-coding credentials in source code or configuration files, you can store them in a vault such as Secrets Manager. Applications retrieve these credentials at run-time to implement their functionality. Passwords, API keys or anything else that is considered a secret can be stored here. Automatic rotation of these credentials is also possible for RDS (MySQL, PostgreSQL and Aurora) database passwords.
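
A minimal boto3 sketch of retrieving a secret at run-time instead of hard-coding it (the secret name and JSON layout are placeholders):

```python
import json
import boto3

secrets = boto3.client("secretsmanager")

# Fetch the secret at run-time instead of keeping it in source or config files
resp = secrets.get_secret_value(SecretId="prod/db/password")  # placeholder secret name
secret = json.loads(resp["SecretString"])
db_password = secret["password"]
```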

GuardDuty: This is a security monitoring tool that continuously analyzes different logs (CloudTrail, VPC Flow Logs, etc.) and generates security findings. Detection rules in GuardDuty come partly from AWS and partly from AWS's security partners, and users can also customize GuardDuty themselves to help detect threats.
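
Pulling those findings programmatically is straightforward; here's a rough boto3 sketch, assuming a detector already exists in the account and region:

```python
import boto3

gd = boto3.client("guardduty")

# GuardDuty is organised around detectors; grab the one in this account/region
detector_id = gd.list_detectors()["DetectorIds"][0]

# List finding IDs, then fetch the full findings
finding_ids = gd.list_findings(DetectorId=detector_id)["FindingIds"]
if finding_ids:
    findings = gd.get_findings(DetectorId=detector_id, FindingIds=finding_ids)
    for f in findings["Findings"]:
        print(f["Severity"], f["Type"], f["Title"])
```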

Inspector: This involves installing an agent on an EC2 instance, which then scans for open ports, checks whether the instance is vulnerable to known CVEs and verifies the system against CIS benchmarks. In short, it is Amazon's vulnerability scanner (for a limited set of checks) aimed at helping EC2 instance owners secure their instances better. If you're managing your instances yourself, this seems like a useful service to have, if you're willing to pay the extra money :). Note that charges are per instance, so if you only have a few servers this could be pretty cheap.

Macie: This is a fancy (fairly pricey) tool that AWS has to detect leakage of specific kinds of data from S3 buckets (up to 3 TB in size). It classifies data based on numerous very specific rules (e.g. 1 and 2). It's also integrated with KMS, which means there is a way to scan bucket content that is encrypted.

Single Sign-On: This allows AWS to function as an SSO solution while being tightly integrated with a number of AWS services. It integrates with AWS Directory Service, so you can store all your user information there and authenticate against it. Additionally, if you authenticate successfully once, you get access to all the services, across all the AWS accounts, that are integrated with SSO. There's also a way to migrate your entire Active Directory to AWS so your users can continue using the same passwords. In a way it's very similar to IAM - except that IAM is scoped to a single account. Here is a good article about how AWS SSO works.

Directory Service: This is AWS's version of Active Directory. You can use Simple AD, which provides a basic feature set and allows easier management of EC2 instances. A more powerful option is the AWS Managed Microsoft AD solution, which lets you access AWS apps, manage instances, use Azure cloud apps, authenticate to an on-premises Active Directory over a VPN connection or share an AD domain hosted in another AWS account. You could also use AD Connector to allow EC2 instances to join an on-premises Active Directory. Users can then access the applications running on EC2 while authenticating against the on-premises Active Directory.

Certificate Manager: This is AWS's certificate authority solution, which helps you use certificates to secure communication with your applications over TLS. You can create certs inside ACM or import certificates from outside. ACM is integrated with a few other common services (not all). The certificate's private key is stored securely and encrypted using KMS.

Key Management Service: This is the AWS key vault that securely stores the master keys protecting the data keys you use to encrypt data. You can let AWS create an AWS-managed master key or create a customer-managed key yourself. The master key never leaves KMS. The master key encrypts the data key, which is the key you actually use to encrypt/decrypt data outside KMS. You can also choose to create the data key outside KMS and import it, where the master key encrypts it. This is envelope encryption, which offers better security compared to encrypting everything directly with a single key. Almost every piece of data needs encryption these days and, very predictably, a lot of AWS services are integrated with KMS.
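
Envelope encryption with boto3 might look roughly like this. The key alias is a placeholder, and Fernet from the cryptography package simply stands in for whatever local cipher you use with the plaintext data key.

```python
import base64
import boto3
from cryptography.fernet import Fernet

kms = boto3.client("kms")

# Ask KMS for a data key; we get the plaintext key plus a copy encrypted under the master key
resp = kms.generate_data_key(KeyId="alias/my-app-key", KeySpec="AES_256")  # placeholder alias
plaintext_key, encrypted_key = resp["Plaintext"], resp["CiphertextBlob"]

# Encrypt data locally with the plaintext key, then discard the plaintext key
ciphertext = Fernet(base64.urlsafe_b64encode(plaintext_key)).encrypt(b"my secret data")

# Later: ask KMS to decrypt the stored (encrypted) data key, then decrypt the data locally
plaintext_key = kms.decrypt(CiphertextBlob=encrypted_key)["Plaintext"]
data = Fernet(base64.urlsafe_b64encode(plaintext_key)).decrypt(ciphertext)
```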

CloudHSM: An HSM is a server that contains specialized hardware optimized to perform cryptographic operations. It helps with operations such as these. HSMs are costly - be sure you need them. In CloudHSM you create a cluster and then add HSMs to the cluster for redundancy. KMS additionally integrates with CloudHSM to store keys even more securely.

WAF and Shield: WAF is a web application firewall that monitors requests and allows/blocks traffic to the web server that hosts content. You can choose which requests are acted upon. Shield helps protect applications against DDoS attacks. It has a Standard and an Advanced mode (the latter, as the name suggests, offers more protection). If you know what you're doing and don't have any fancy requirements, Shield Standard should be good enough.

Security Hub: This is a one-stop shop to view the results of security scans done by GuardDuty, Inspector and Macie. Additionally, findings from other partners are also listed here. It also claims to help businesses stay compliant with CIS benchmarks.

Artifact: This is where you can go to look at all your agreements with AWS and manage them. Additionally, you can download numerous reports published by third parties verifying Amazon's compliance with various regulations.

Friday, May 24, 2019

AWS - Networking Services

VPC: This is the DMZ/VLAN/segmentation equivalent for the cloud. You can create a VPC, create subnets inside the VPC and then place EC2 or RDS instances (or anything else that needs an IP address) inside individual subnets. You can then set ACLs on the VPC or individual subnets (in addition to security groups on the instances themselves) to control inbound and outbound communication. A VPC can have both private and public (internet-facing) subnets. There are also private VPC endpoints for public services such as KMS (cryptography), which ensure that traffic to KMS is sent exclusively over the AWS network instead of over the Internet. This is one of those services that you will almost certainly use if you are on the cloud, so do be familiar with it. :)
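
A minimal boto3 sketch of carving out a VPC with one subnet and an internet gateway (the CIDR ranges are arbitrary examples):

```python
import boto3

ec2 = boto3.client("ec2")

# Create the VPC and a subnet inside it
vpc = ec2.create_vpc(CidrBlock="10.0.0.0/16")
vpc_id = vpc["Vpc"]["VpcId"]
subnet = ec2.create_subnet(VpcId=vpc_id, CidrBlock="10.0.1.0/24")

# Attach an internet gateway so the subnet can be made public (route table setup omitted)
igw = ec2.create_internet_gateway()
ec2.attach_internet_gateway(
    InternetGatewayId=igw["InternetGateway"]["InternetGatewayId"],
    VpcId=vpc_id,
)
```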

CloudFront: It is common practice to use a CDN to cache static content in locations closest to the user (the edge of the network), so round trips to the web server and DB server can be avoided. These days even dynamic content is served through edge locations, which maintain optimized connections back to the origin servers. AWS CloudFront claims to look at the requests coming in and make decisions about what dynamic content to serve to whom.

CloudFront is also integrated with web application firewall and DDoS protection services to protect applications against malicious attacks. It additionally integrates with Lambda (to run functions based on specific events), handles cookies (possibly for authenticated requests) and works with ACM so that a specific certificate is shown to the end user. Here is a good article about how CDNs work, along with a nice diagram at the bottom.

Route 53: This is AWS's DNS service. It allows users to register their domains, configure DNS routes so that users can reach their applications, and check the health of web servers that are registered with Route 53.

API Gateway: This allows users to create REST (HTTP) and WebSocket APIs for any functionality they want to implement. You can integrate an API with an HTTP backend (passing query string parameters through), call a Lambda function when an API is invoked, or integrate it with other AWS services, and then return a response to the end user.

Direct Connect: This establishes a physical link between the end-user network and an Amazon location that supports Direct Connect. For this purpose, fiber-optic cables that support either 1 Gbps or 10 Gbps must be used, and the customer's network devices must meet certain requirements. The main purpose of this service is to speed up data transfer between an on-premises network and AWS by bypassing a lot of the public Internet. The AWS side can be a public service like S3 or something privately hosted inside a customer VPC. The other key factor is that this is apparently much cheaper than accessing S3 or VPCs over the Internet. Here's one such implementation.

App Mesh: Microservice architectures are quite common these days. The greater the number of microservices, though, the greater the management overhead from a monitoring perspective. Once applications are already running somewhere (EC2 for example), App Mesh, which is built on Envoy, can be configured so that traffic to every single microservice of the application first passes through its proxies. Rules configured in App Mesh then determine the next steps to be taken. This is better than installing software on the OS of every microservice host and having them communicate to diagnose problems.

Cloud Map: This allows you to create user-friendly names for all your application resources and store this map. It can all be automated, so as soon as a new container is created or a new instance is spawned due to more traffic, its IP address is registered in Cloud Map. When one microservice needs to talk to another service, it looks it up in Cloud Map. This means you no longer need to maintain a configuration file with the locations of your assets - you can just look them up in Cloud Map.

Global Accelerator: Once configured, Global Accelerator gives you static IP addresses mapped to several backend endpoints. Traffic that hits those addresses is routed over the AWS network to hosts that are close to the user's location and under less load, so the overall availability and performance of the application improves. The aim is that traffic doesn't hit nodes that are not performing well.

Thursday, May 23, 2019

AWS - Migration Services

Application Discovery Service: This one's for finding out what on-premises servers you have and building an inventory of them that is then displayed in the console. For VMware vCenter hosts there's an AWS appliance VM you install that does the discovery. Alternatively, you can install an agent on every on-premises host you want tracked. The last option is to fill out a template with a lot of data and import it into the console.

Database Migration Service: This is pretty self-explanatory in that it allows you to migrate from one AWS data store to another (with support for Aurora, MySQL and plenty of others) or to/from an on-premises instance. You can't do on-premises to on-premises :). The source database can apparently remain live throughout the migration, which AWS claims (and it probably is - idk) is a great advantage.

Server Migration Service: Just like the previous service helps migrate on-premises databases, this one helps migrate on-premises servers in VMware, Hyper-V and, interestingly, Azure to AWS. A connector VM is downloaded and deployed in VMware vSphere. When you say so, it starts collecting the servers you've deployed in vSphere and uploads them as Amazon Machine Images (AMIs) to the cloud. These images can then be tested by creating new EC2 instances from them, to check that they're functional before deploying them to production.

AWS Transfer for SFTP: This is quite simply a managed SFTP server service from AWS. The aim is to tempt people away from managing their own SFTP servers on-premises and to migrate data to the cloud. It supports password and public-key auth, and stores data in S3 buckets. All normal SSH/SFTP clients should work out of the box. Authentication can be managed either via IAM or via your own custom authentication mechanism.

AWS Snowball: This is an appliance that you can have shipped to your data center, copy all your data to (up to 80 TB for Snowball or 100 TB for Snowball Edge) over your local network, and then ship back to AWS. AWS takes the box and imports all the data into S3. The key win here is that you don't need to buy lots of hardware to do the transfer; you use AWS's own appliance instead. It also saves a ton of bandwidth, because you're doing local transfers instead of going over the internet.

DataSync: In contrast to Snowball, DataSync transfers data between customer NFS servers and S3 or EFS over the network at high speeds, using a custom AWS DataSync protocol (the claim is up to 10 Gbps). Alternatively, you can go from NFS in the cloud to S3, also in the cloud. In the case of an on-premises server, a DataSync agent is installed as a vSphere OVA, after which you add the various locations and configure them as sources or destinations. Finally, a task starts and data is transferred between the two locations. Here's a nice blog demonstrating this.

AWS Migration Hub: This is sort of a one-stop shop for kicking off discovery or data migration using the various other services that AWS has. Some of these were already mentioned above (the Server and Database Migration Services). In addition, there are some integrated migration tools (ATADATA ATAmotion, CloudEndure Live Migration etc. - none of which I've heard of :)) that one can use when performing a migration. There is no additional cost to use this service - you pay for the individual tools themselves.

Tuesday, May 21, 2019

AWS - Database Services

RDS: AWS's relational database service, which hosts MySQL, PostgreSQL, SQL Server, Oracle, MariaDB (an open-source fork of MySQL) and Amazon's own Aurora. Applications on application servers in data centers or hosted in the cloud can both use RDS as a data source, customizing the DB instance (the basic unit of RDS) with the hardware and memory they want. The databases can all be administered using their respective clients. AWS networking and backups are integrated with RDS.
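
Provisioning a small MySQL instance with boto3 might look something like this (the identifier, credentials and sizes are placeholders; the instance takes several minutes to become available):

```python
import boto3

rds = boto3.client("rds")

# Launch a small MySQL instance
rds.create_db_instance(
    DBInstanceIdentifier="demo-db",           # placeholder
    DBInstanceClass="db.t3.micro",
    Engine="mysql",
    MasterUsername="admin",                   # placeholder
    MasterUserPassword="change-me-please",    # placeholder
    AllocatedStorage=20,                      # GB
)
```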

DynamoDB: AWS's NoSQL database, which stores data as JSON-style key-value pairs ("a" : "test"). Instead of writing SQL queries like with a relational database, you write NoSQL queries against that JSON. It integrates with Auto Scaling, which changes the read and write capacity of the database depending on request volume. It also integrates with KMS, allowing you to encrypt data at rest. It claims to scale really well horizontally (throw more computers at the problem). DynamoDB also has an HTTP API that you can use to query it directly. As usual, the devil is in the details and it is probably not for everyone. There's a nice blog with a cool flowchart about when one should and should not use DynamoDB.
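
A minimal boto3 sketch of writing and reading an item; the table name is a placeholder, and the table is assumed to already exist with "username" as its partition key:

```python
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("users")   # placeholder table name

# Write a JSON-style item keyed on "username"
table.put_item(Item={"username": "alice", "email": "alice@example.com"})

# Read it back by key
item = table.get_item(Key={"username": "alice"})["Item"]
print(item["email"])
```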

ElastiCache: This is an in-memory data store that supports Redis and Memcached. The point of an in-memory store is to speed up responses, so users do not have to wait as long to use services. In other words, it acts as a caching layer in front of the database. If a user's request can be served from the Redis cache, it will be - and faster than a round trip to the database. Here is a link to a comparison between Redis and Memcached.
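
That cache-in-front-of-the-database pattern looks roughly like this with the redis-py package; the endpoint is a placeholder, and fetch_user_from_db is just a stand-in for your real database query:

```python
import json
import redis

# ElastiCache gives you a Redis endpoint; connect to it like any other Redis server
r = redis.Redis(host="my-cache.example.use1.cache.amazonaws.com", port=6379)  # placeholder endpoint

def fetch_user_from_db(user_id):
    # Placeholder for your real database query
    return {"id": user_id, "name": "example"}

def get_user(user_id):
    cached = r.get(f"user:{user_id}")
    if cached:                                           # cache hit: skip the database
        return json.loads(cached)
    user = fetch_user_from_db(user_id)                   # cache miss: hit the database
    r.setex(f"user:{user_id}", 300, json.dumps(user))    # cache the result for 5 minutes
    return user
```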

Neptune: This is a graph database. It is largely useful when there are large sets of data that are related to each other. The inter-related data is stored in the database and users can query it using languages built specifically for graphs (Apache TinkerPop Gremlin and SPARQL). It's interesting that the smallest DB instance you can provision from inside Neptune is a db.r4.large (~16 GB RAM) - which by itself suggests that this is a product aimed at very large data sets.

Redshift: This is AWS's enterprise data warehousing solution. In other words, it helps analyze petabytes (if you want) of data from a variety of sources such as S3, Glacier, Aurora and RDS. There's a lot of database design needed, so I'm guessing (I do not know for sure) that things can get pretty complex pretty soon. Once the data is inside a Redshift cluster (for example, copied from S3), you can run SQL queries, including complex ones, against the cluster. If you don't have huge amounts of data, you probably do not want Redshift.

DocumentDB: This is basically there so you can migrate all your MongoDB content to the cloud while continuing to use the MongoDB-compatible clients and tools you already have. All you then do is change the DB endpoint to point to the DocumentDB endpoint in the cloud. The cool bit here is that you can autoscale the storage your DB needs and the read capacity (how many queries you can make), so large applications are easily served. Here too the smallest instance is a db.r5.large with 16 GB of RAM, so it feels like a production-oriented service that might be expensive for smaller loads. I don't know that for a fact though - so please do your testing :)

AWS - Storage Services

S3: This is arguably (along with EC2) the most popular service that AWS offers. In short, it allows users to store their files - behaving like an online file store. It has other uses, such as hosting static website content. Services very commonly store audit logs here; in short, S3 is integrated with a large number of AWS services. S3 is a global service and bucket names are globally unique - two users cannot create buckets with the same name. Files are stored inside buckets as objects, each identified by a key. For such a popular service it has relatively few options via the AWS CLI (which are sufficient). If you're starting to learn about AWS, this is the place to start.
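
The boto3 basics, as a quick sketch (the bucket name is a placeholder and has to be globally unique):

```python
import boto3

s3 = boto3.client("s3")

# Create a bucket (in regions other than us-east-1 you also need a CreateBucketConfiguration)
s3.create_bucket(Bucket="my-unique-bucket-name-12345")  # placeholder name

# Upload a local file as an object under the key "logs/app.log"
s3.upload_file("app.log", "my-unique-bucket-name-12345", "logs/app.log")

# Download it again
s3.download_file("my-unique-bucket-name-12345", "logs/app.log", "app-copy.log")
```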

EFS: This is an NFS file system that automatically grows to fit the files you store on it. You can use an NFS client on an EC2 Linux system to remotely mount it and then read/write from/to the file system. There's also an interesting concept called lifecycle management, which moves infrequently used files to a different class of EFS storage that costs less.

The GCP equivalent for this is Filestore.

FSx: This too, in short, is a file system that can be accessed remotely, but it has been built with Windows systems in mind. The targets are users with Windows applications that need access to a lot of data over the network via SMB mapped network drives. Linux systems can also access these mapped drives using a package called cifs-utils. FSx additionally supports Lustre, a file system aimed at applications that require a lot of computation.

S3 Glacier: If you have a large number of files that you do not want to delete (like old pictures) but do not use often, S3 Glacier is the thing to use. The unit of storage for Glacier is a vault, which is roughly equivalent to a bucket in S3. Only creation and deletion of vaults happens through the console; everything else happens via the CLI or SDK. Additionally, it claims to be extremely low cost, which I'm not saying anything about :)


Storage Gateway: If there is an existing data center where you already have a large number of applications that talk to local storage, scaling can become hard quickly once you have a lot of traffic. The AWS Storage Gateway is available as a virtual machine appliance (ESXi), an on-premises 1U hardware appliance (buy it on Amazon) or even an EC2 appliance. Once it's activated, the appliance picks up data from your data center stores and puts it onto S3, Glacier or EBS. You can then point your application to the new stores via an NFS client and it should work seamlessly. Here is a blog that walks you through a sample process. Additionally, it allows backup applications to write directly to the gateway (configurable as a tape gateway) and back up straight to AWS S3 or Glacier.


AWS Backup: This service allows you to back up data from EC2, RDS and a few other services to S3 and then move that data to Glacier (I think) after a certain time. You can configure backup plans to decide what gets backed up (by tagging resources), when, whether it's encrypted or not, and when the backup is deleted. As of now it only supports a few services, but it's reasonable to assume that as it becomes more popular more services will be added.

Thursday, May 16, 2019

AWS - Compute - Container Services

Here is an image from the Docker website that describes how containers work.



Teams are increasingly building their workflows around Docker containers. Amazon has a few services that make this easier. This post briefly discusses each of these services.

ECR: This is a registry for Docker images that you build on your machine and then upload to AWS. For example, you can build an Ubuntu image with a LAMP stack and any other custom packages and push it to ECR. When other AWS services need that image for some purpose, it is easily available.

ECS: Once the Docker images you built earlier are uploaded to ECR, you can use these images on EC2 instances to perform whatever computing tasks are specific to that container. This is where ECS comes in. Users tell ECS which containers to run; ECS then picks up the images, identifies EC2 instances they can run on (grouped into a cluster) and runs them there.

Once the cluster is ready, a task definition needs to be created. This defines how the containers are run (what port, which image, how much memory, how many CPUs and so on). When the task definition is actually used, a task is created and run on the cluster of EC2 instances (each called an ECS container instance) that was originally created.

An ECS agent is additionally installed on each ECS container instance; it communicates with the ECS service itself and responds to start/stop requests made by ECS.
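
Putting those pieces together, registering a task definition and running it on a cluster with boto3 might look roughly like this (the cluster name, image URI and resource sizes are placeholders):

```python
import boto3

ecs = boto3.client("ecs")

# Describe how the container should run: image, memory, CPU, ports
ecs.register_task_definition(
    family="web-task",                      # placeholder
    containerDefinitions=[{
        "name": "web",
        "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/web:latest",  # placeholder ECR image
        "memory": 256,
        "cpu": 128,
        "portMappings": [{"containerPort": 80, "hostPort": 80}],
    }],
)

# Run one copy of the task on an existing cluster of container instances
ecs.run_task(cluster="demo-cluster", taskDefinition="web-task", count=1)  # placeholder cluster
```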

The equivalent product on GCP is Google Kubernetes Engine (GKE).

EKS: Kubernetes has an architecture where there is a master node (the control plane) and a number of worker nodes (roughly equivalent to ECS container instances running agents) that report the state of their workloads back to the master. The master then (similar to ECS) schedules and controls the workloads running across the workers. Here is a diagram that illustrates this:



EKS on Amazon allows the Kubernetes master to run inside the AWS environment and communicate with deployments elsewhere, while simultaneously integrating with ELB, IAM and other AWS services.

Batch: If you have jobs you want to schedule and run periodically, with resources automatically scaling up or down as jobs complete or need more memory/compute, AWS Batch is a good idea. AWS Batch internally uses ECS, and hence Docker containers on EC2/Spot instances, to run the jobs. Here is a nice guide that goes into an example of using Batch in a bit more detail.
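
Submitting a job is about this simple with boto3, assuming the job queue and job definition have already been created (all the names here are placeholders):

```python
import boto3

batch = boto3.client("batch")

# Submit a job against a pre-created job queue and job definition;
# Batch schedules it onto ECS-managed compute behind the scenes
batch.submit_job(
    jobName="nightly-report",          # placeholder
    jobQueue="default-queue",          # placeholder
    jobDefinition="report-job:1",      # placeholder
    containerOverrides={"command": ["python", "report.py"]},
)
```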

Tuesday, May 14, 2019

AWS - Compute Services

This post summarizes some of the AWS compute services. I deliberately do not cover the ones that deal with containers, as I plan to blog separately about those. I'm looking at Google Cloud side by side from now on, so I'll keep updating these posts to mention whether there is an equivalent. When I get to Azure, I'll do the same there as well :)

EC2: EC2 is one of the most popular services that AWS has. It basically allows you to spin up virtual machines with a variety of operating systems (Linux, Windows and possibly others) and gives you a root account on them. You can then SSH in using key authentication and manage the system. What you want to use it for is completely up to you: host a website, crack passwords as a pen-tester, test some software or really anything else.
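
Spinning up an instance with boto3 is only a few lines; the AMI ID and key pair name below are placeholders:

```python
import boto3

ec2 = boto3.client("ec2")

# Launch a single small instance from an AMI; SSH in afterwards with the named key pair
resp = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder AMI ID
    InstanceType="t3.micro",
    KeyName="my-keypair",              # placeholder key pair
    MinCount=1,
    MaxCount=1,
)
print(resp["Instances"][0]["InstanceId"])
```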

The GCP equivalent for EC2 is Compute Engine.

Lightsail: Lightsail is very similar to EC2 except that it comes with pre-installed software such as WordPress or a LAMP stack, and you pay a fixed monthly price for the server. The plus here is that it's easier for non-technical users, compared to EC2 where you have to do everything yourself. In other words, it is Amazon's VPS solution.

Lambda: This is AWS's Function-as-a-Service solution. In other words, you write code and upload it to Lambda. You don't have to worry about where you'll host your code or how you'll handle incoming requests. You can configure triggers in other services and have Lambda act when the trigger fires. For example, you can create a bunch of REST APIs and have the back-end requests handled by a Lambda function, upload files to S3 and have something happen each time a specific file is uploaded, or do more detailed log analysis each time an event is logged to CloudWatch. Lambda is integrated with a large number of AWS services, so it is well worth learning and using it well.
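
For instance, a Python function wired to an S3 "object created" trigger is just a handler like this sketch (the bucket/key fields come from the standard S3 event structure Lambda passes in):

```python
# A minimal Python Lambda handler for an S3 "object created" trigger.
def lambda_handler(event, context):
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        print(f"New object: s3://{bucket}/{key}")
    return {"status": "ok"}
```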

The GCP equivalent for Lambda is Cloud Functions.

Elastic Beanstalk: If you have some code you've built locally and want to deploy it quickly, without worrying about the underlying infrastructure or spending a lot of time tweaking it, Beanstalk is the way to go. You can, for example, choose Python as a runtime environment, upload your Python code and let AWS take over. AWS will create the roles, security groups and EC2 instances that are needed (among other things) and deploy your application so it is easily accessible. If you need additional components such as databases or want to modify the existing configuration, these can be added to the environment later.

The GCP equivalent for Elastic Beanstalk is App Engine.

Serverless Application Repository: This is a large repository of applications that have been created by users and uploaded for use by the community. One can grab these applications and deploy them in one's own AWS account. The requisite resources are then created by deploying a SAM template. The applications can be used as-is, or modified and code-reviewed before actual use. If you change your mind, you can delete the CloudFormation stack - this will delete all the AWS resources that were created during deployment.