February 9, 2021

Scale out workload using Azure HDInsight

Filed under: architecture,cloud Himanshu @ 12:47 pm

Scale is one of the reasons organizations of any size, enterprise or startup, go for cloud computing, whether they need to scale up or scale out. For the uninitiated: simply put, scaling up means adding more power to a single machine, and scaling out means distributing the computing workload across multiple computers. Scaling out makes it possible to reach a level of scale that scaling up cannot; it gives you virtually unlimited scale.

Using the arsenal of Azure services, a nice variety of architectural patterns is possible for scaling out large workloads. Azure Functions, App Service, Azure Kubernetes clusters, HDInsight clusters, Azure Synapse and Virtual Machine Scale Sets are the most commonly used Azure services for scaling out, depending on the kind of workload. We recently solved a large data processing workload using Azure HDInsight.

We used Azure HDInsight for a healthcare client whose datasets hold multiple years of anonymized patient treatment data across multiple entire health systems. The processing itself was complex and entailed many transformations and computations. The end result was to submit the analyzed information to CMS (Centers for Medicare & Medicaid Services) in a condensed and auditable manner. We designed our solution for exactly this high-demand situation, using Apache Spark on an Azure HDInsight cluster. Azure HDInsight is a platform that makes it easy to create a cluster of computers with preconfigured open source frameworks such as Apache Spark, Apache HBase and Apache Kafka. Since we used Apache Spark, we loaded the processing code and raw data into the cluster to complete the big data processing.

We spawned a cluster of about 40 computers working together. Effectively, this meant we ‘created’ a huge computer with about 2.5 terabytes of RAM (that’s right, it’s not a typo: terabytes, not gigabytes, and RAM, not disk storage!), about 325 processor cores equating to about 650 virtual cores, and about 16 terabytes of disk space, all working together on a single large workload. Using this design and scale, we completed in about 8 hours a workload that would otherwise have taken about 2-3 weeks. Finally, since we needed all this only temporarily, we could tear the entire infrastructure down once we were done, keeping our infrastructure cost optimal.
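To put those cluster numbers in perspective, here is a rough back-of-the-envelope calculation in Python. It uses only the approximate figures quoted above. Two things are assumptions rather than facts from the project: that the 40 machines were identically sized (so per-machine averages are meaningful), and that the 2-3 week estimate means round-the-clock processing.

```python
# Approximate figures quoted above.
NODES = 40
TOTAL_RAM_TB = 2.5
TOTAL_VCORES = 650
TOTAL_DISK_TB = 16
CLUSTER_HOURS = 8
SINGLE_MACHINE_WEEKS = (2, 3)

GB_PER_TB = 1024         # assumption: binary units
HOURS_PER_WEEK = 7 * 24  # assumption: the 2-3 weeks would run around the clock

# Assumption: all nodes are the same size, so an average describes one node.
ram_per_node_gb = TOTAL_RAM_TB * GB_PER_TB / NODES
vcores_per_node = TOTAL_VCORES / NODES
disk_per_node_gb = TOTAL_DISK_TB * GB_PER_TB / NODES

# Speedup of the 8-hour cluster run over the low and high single-machine estimates.
speedup_low, speedup_high = (
    weeks * HOURS_PER_WEEK / CLUSTER_HOURS for weeks in SINGLE_MACHINE_WEEKS
)

print(f"RAM per node:    ~{ram_per_node_gb:.0f} GB")
print(f"vCores per node: ~{vcores_per_node:.1f}")
print(f"Disk per node:   ~{disk_per_node_gb:.0f} GB")
print(f"Speedup:         ~{speedup_low:.0f}x to ~{speedup_high:.0f}x")
```

Under these assumptions the cluster averages about 64 GB of RAM and 16 virtual cores per machine, and the 8-hour run is roughly 42 to 63 times faster than the 2-3 week estimate.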

It may sound clichéd, but there is a lot of truth to the statement “Cloud computing is a big game changer”! There are several situations where having access to large computing resources can be a great competitive advantage. Thanks to AWS, Azure and other cloud providers, this can be done at a fraction of the capex; however, your software architecture needs to support it.

September 24, 2020

configuration or constant

Filed under: Uncategorized Himanshu @ 10:35 am

Should I put this in configuration or in a constant? I was reviewing the code of a third-party software product. One piece specifically stuck in my head: a configuration entry for the number of milliseconds in an hour!

I don’t think it’s worth multiple paragraphs: a fact of life like the number of milliseconds in an hour is not a good candidate for a configuration entry. Give it a good name and keep it a constant, even if it’s not much effort to make it configurable.
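As a small illustration in Python, with names of my own choosing, a fact like this can live in a named constant, while a value that genuinely varies between deployments stays in configuration. The `poll_interval_hours` setting is a hypothetical example, not something taken from the reviewed software.

```python
# Facts of life: these never change, so they are constants, not configuration.
MILLISECONDS_PER_SECOND = 1_000
SECONDS_PER_MINUTE = 60
MINUTES_PER_HOUR = 60
MILLISECONDS_PER_HOUR = (
    MILLISECONDS_PER_SECOND * SECONDS_PER_MINUTE * MINUTES_PER_HOUR
)

# Hypothetical setting that may reasonably differ per deployment,
# so it belongs in configuration.
config = {"poll_interval_hours": 2}


def poll_interval_ms(settings: dict) -> int:
    """Convert the configured interval to milliseconds using the constant."""
    return settings["poll_interval_hours"] * MILLISECONDS_PER_HOUR


print(poll_interval_ms(config))  # 7200000
```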


August 2, 2020

IR Software Remote Control

Filed under: electronics Himanshu @ 12:21 pm

This weekend, I reopened my boxes of electronic components to assess and validate a couple of ideas I had for building an alternate interface to my TV, an alternative to the TV remote.

During the discovery phase, or say the “feasibility study”, of the idea, I found two useful resources.

Arduino IR Remote is a great library for working with the proprietary protocols of different manufacturers, including the one I was looking for: Sony.

LIRC is a great repository for looking up the codes to send to electronic devices for various operations, including TVs but not limited to them.

Below is intermediate code that accepts a number on serial and, depending on the number entered, sends an IR signal to my TV.

#include <IRremote.h>

IRsend irsend;

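void setup() {
  // Open serial communications and wait for port to open:
  Serial.begin(9600);  // baud rate assumed; the Serial.begin call is missing from the listing
  while (!Serial) {
    ; // wait for serial port to connect. Needed for native USB port only
  }

  // send an intro:
  Serial.println("\n\nString toInt():");
}

// Sends a 12-bit Sony code, repeated three times.
void sendCommand(unsigned long signalValue){
  for (int i = 0; i < 3; i++) {  // sends code 3 times
    irsend.sendSony(signalValue, 12);
    delay(40);  // pause between repeats (assumed, as in the IRremote send demo)
  }
}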

void onOff() { sendCommand(0xA90); }
void input() { sendCommand(0xA50); }

void volumnUp() { sendCommand(0x490); }
void volumnDown() {  sendCommand(0xC90); }
void mute() { sendCommand(0x290); }

void priorProgram() { sendCommand(0x890); }
void nextProgram() { sendCommand(0x90); }

void homeKey() { sendCommand(0x070); }
void okay() {  sendCommand(0xA70); }
void upKey() { sendCommand(0x2F0); }
void downKey() { sendCommand(0xAF0); }
void leftKey() { sendCommand(0x2D0); }
void rightKey() { sendCommand(0xCD0); }
void exitKey() { sendCommand(0xC70); }
void infoKey() { sendCommand(0x5D0); }

void one() { sendCommand(0x010); }
void two() {  sendCommand(0x810); }
void three() { sendCommand(0x410); }
void four() { sendCommand(0xC10); }
void five() { sendCommand(0x210); }
void six() { sendCommand(0xA10); }
void seven() { sendCommand(0x610); }
void eight() {  sendCommand(0xE10); }
void nine() { sendCommand(0x110); }
void zero() { sendCommand(0x910); }

void audio() { sendCommand(0xE90); }

String inString = "";    // string to hold input
void loop() {
  // Read serial input:
  while (Serial.available() > 0) {
    int inChar = Serial.read();
    if (isDigit(inChar)) {
      // convert the incoming byte to a char and add it to the string:
      inString += (char)inChar;
    }
    // on a newline, dispatch on the number entered, then clear the buffer:
    if (inChar == '\n') {
      switch (inString.toInt()){
        // The call made by each case is missing from this listing.
        // Each number presumably invoked one of the helpers above
        // (for example onOff() or mute()) followed by break.
        case 1:
        case 2:
        case 3:
        case 4:
        case 5:
        case 6:
        case 7:
        case 8:
        case 9:
        case 10:
        case 11:
        case 12:
        case 13:
        case 14:
        case 15:
        case 16:
          break;
      }
      inString = "";
    }
  }
}

June 3, 2018

Microservices is more than just an architecture

DevOps is a culture and not a methodology! For the scope of this post, I do not plan to defend or argue against that statement. But I will be brave enough to claim that the practices and patterns by which software teams (and organizations) are built and organized certainly do impact the culture of the team and of the individuals in it.

Also, whether a given practice will work for a team depends on the prevailing culture of that team.

Those who agree with the two points above, and who have been through the adoption of Microservices in their software architecture, would likely agree that making people accountable, or convincing them to take ownership of a software service or any other area, is easier in such teams than in teams that have a hard time adopting a Microservices architecture.

Individuals and teams who understand Microservices architecture well are more likely to say “yes” to the question “Can you please own this?”

And I strongly believe that in this equation it’s not only a=b but also b=a. That is, if an individual understands the importance of taking ownership, she will likely understand and adopt Microservices more easily than others.

December 16, 2016

World of Serverless Applications

Filed under: architecture,cloud Himanshu @ 2:48 pm

I still remember that night in 1998. My friend and I sat with a bunch of floppies to install a Novell NetWare server on hardware about the same physical size as a home refrigerator. By the next morning we had the server ready, giving us the ability to share files among a bunch of PCs, plus user management. The word “Server” meant a really big thing in those days. An exaggerated analogy could be “in those stone age days!” A lot has changed between those stone age days and now, and today we are talking about Serverless architecture.

Serverless architecture really seems promising to me. It is a new paradigm, and once developer toolsets and frameworks mature around it, we will be heading into a new, exciting world! Obviously, Serverless architecture does not literally mean that there won’t be any server; from that perspective the name is misleading. What it means in a nutshell is that software engineers and deployment teams don’t need to handle provisioning for the software that solves the business problem. Health monitoring software takes care of it, depending on the load. Scalability, from running on ‘no server’ to ‘N servers’, depends on the load at a given point in time. It all started when AWS once again proved itself a leader in cloud innovation by introducing AWS Lambda at the end of 2014. More recently, Microsoft published a similar service in its Azure offering, called Azure Functions. Google’s GCP is also following the lead with an offering named Cloud Functions.

Serverless architecture is Microservices and Scalability on steroids!

These are a couple of scenarios where I think it fits best:

  • Startups can benefit most from Serverless architecture. In the beginning there will be low usage and hence less revenue, so infrastructure cost needs to be low; when the startup grows to a large scale, the solution reacts and scales to the increased load automatically, and so does the infrastructure cost. All of this without any code change, if architected right from the beginning.
  • Another scenario is an IoT web service endpoint to which devices connect to push or pull data. The number of connected devices would drive the infrastructure cost dynamically.
  • Blogs and content management sites. Hey, this is a big opportunity! There are many organizations in the world that do not want the hassle of maintaining servers and do not want to spend a lot on infrastructure, but do want a lightweight online presence. They would be greatly helped by a platform on which they pay by the number of requests coming to their site instead of a fixed hardware cost. What do you say? If you get there before me, consider giving me credit for the idea :).

While all looks bright around Serverless architecture, here are a few suggestions from my exploration:

  • Use a coding framework that is lightweight in its boot-up time. This avoids long request times after a cold boot. I’m using node.js.
  • While better developer toolsets are becoming ready and available to the general community, consider using tools like the Serverless package available on the npm registry, provided by Serverless.com. It works very well with AWS Lambda. It not only deploys code to AWS Lambda but also sets up an HTTP endpoint in API Gateway if the Lambda’s event is configured to be an HTTP endpoint, and everything works seamlessly. The Serverless package can be used even if your solution is not built in node.js.
  • Never try to do a lot of things together in one Lambda/Function. Break time-consuming work down into multiple chunks of work items and use AWS Simple Queue Service (SQS) or Azure Queues.
  • Do good enough logging, as that will be your savior when debugging any issue.
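The suggestion to break long-running work into chunks can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not code from the example application: the function names and the sample work items are made up, and nothing is actually sent to a queue. It only shapes the work into batches of at most 10 entries, which is the per-call limit of the SQS SendMessageBatch operation.

```python
import json
from typing import Any, Dict, Iterator, List, Sequence

# SendMessageBatch accepts at most 10 entries per call.
SQS_MAX_BATCH_ENTRIES = 10


def chunked(items: Sequence[Any], size: int) -> Iterator[Sequence[Any]]:
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_batches(work_items: Sequence[Dict[str, Any]]) -> List[List[Dict[str, str]]]:
    """Turn work items into lists of batch entries, one message per work item."""
    batches = []
    for batch_no, chunk in enumerate(chunked(work_items, SQS_MAX_BATCH_ENTRIES)):
        entries = [
            # Ids only need to be unique within one batch request.
            {"Id": f"{batch_no}-{position}", "MessageBody": json.dumps(item)}
            for position, item in enumerate(chunk)
        ]
        batches.append(entries)
    return batches


# Hypothetical job: 25 records that are too slow to process in one invocation.
work_items = [{"record_id": n} for n in range(25)]
for batch in build_batches(work_items):
    print(len(batch), "messages, first id:", batch[0]["Id"])
```

Each batch has the shape that boto3’s `send_message_batch` expects for its `Entries` argument, so a producer could send one batch per call and let a separate Lambda function pick up each message.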

Here is a quick, simple example. Let’s first go through the requirements: the application allows a user to create, view, edit and archive notes.

  • The user can add notes. The application records each note along with the date-time when it was posted
  • While adding or editing notes, the user can apply formatting. Supported formatting is bold, italic, and underline
  • The user can view the list of notes, ordered by posted date-time in descending order
  • The user can archive a note. Once archived, the note is filtered out of the list of notes
  • The user can review the list of archived notes
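The requirements above can be captured in a small data model. The following is a minimal in-memory sketch in Python, not the actual implementation of the example application (which runs on node.js and AWS Lambda); the class and method names are hypothetical, and formatting is assumed to travel as markup inside the note body.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from itertools import count
from typing import Dict, List, Optional


@dataclass
class Note:
    note_id: int
    body: str            # assumed to carry bold/italic/underline markup
    posted_at: datetime  # recorded once, when the note is first posted
    archived: bool = False


class NoteStore:
    """In-memory stand-in for whatever storage the real application uses."""

    def __init__(self) -> None:
        self._notes: Dict[int, Note] = {}
        self._next_id = count(1)

    def add(self, body: str, posted_at: Optional[datetime] = None) -> Note:
        note = Note(next(self._next_id), body, posted_at or datetime.now(timezone.utc))
        self._notes[note.note_id] = note
        return note

    def edit(self, note_id: int, body: str) -> None:
        self._notes[note_id].body = body  # posted_at stays unchanged

    def archive(self, note_id: int) -> None:
        self._notes[note_id].archived = True

    def _listing(self, archived: bool) -> List[Note]:
        matching = [n for n in self._notes.values() if n.archived == archived]
        # Newest first, as the requirements ask.
        return sorted(matching, key=lambda n: n.posted_at, reverse=True)

    def list_notes(self) -> List[Note]:
        return self._listing(archived=False)

    def list_archived(self) -> List[Note]:
        return self._listing(archived=True)
```

Keeping `archived` as a flag on the note, rather than moving archived notes elsewhere, means both listings are the same query with a different filter.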

Here is how it is structured:


I’m in the process of enhancing this example application with more features, and in doing so will end up using more AWS services.

  • Authentication and authorization
  • Storing more structure along with a note, like author, tags, and sharing the note with other subscribers
  • Searching notes
  • Attachments
  • Notification when a shared note changes

To support these features I will use DynamoDB, Simple Email Service (SES), Simple Notification Service (SNS), Simple Queue Service (SQS), and more.

Eager to see new ways of building and deploying software unfold!

September 10, 2015

Edge is giving priority to content

Filed under: Uncategorized Himanshu @ 12:11 pm

I’m on Windows 10, Yeah!

Today, I noticed that when I press Alt+D in Microsoft Edge with a JIRA issue open in the tab, instead of the cursor moving to the address bar, the Dashboards menu opens in JIRA! When I’m editing this post in WordPress and press Alt+D, a del HTML tag gets inserted in the text area! However, when I’m on a Microsoft Edge page, it goes to the address bar.

Giving priority to content rather than the browser: I like that idea. It will help more web applications support keyboard shortcuts, and users can be more efficient in using those applications.

I liked it!

June 3, 2015

Code refactoring is important

A great mind has said it well: “Change is the only constant in life.”

What is the difference between hardware and software? One aspect suggests that “Hardware is likely to have mechanical parts that are subject to wear and tear, and will need maintenance and replacement to extend continuous service, while software does not have any mechanical parts and is not subject to wear and tear.”

If we lived in a static world, this would be true, but unfortunately, or rather fortunately, we don’t. Like many other things, software, or the environment within which it is used, changes. Changes in the way the business that uses the software operates (i.e. requirement changes), enhancements to support extended business needs, changes in the environment the software was assumed to run in, or any other assumption the software was built on proving to be wrong: these are some examples of why software needs regular maintenance.

In science, entropy is a well-used word for explaining gradual decline into disorder. It is bound to happen over time, until equilibrium is reached. To draw the analogy, well-engineered code at one point in time is the ordered state. In a changing world, the demands on software change, and it is inevitable that at some point in its life cycle the software drifts toward a disordered state. I have accepted this as fact until proven wrong, and based on it I would suggest that a business that depends on the software put effort and money into bringing the software back to an ordered state; otherwise it is bound to pay a higher cost down the line. Do talk to the engineering team to understand the accumulated disorder, and prioritize it in the backlog. The earlier it is done, the better.

May 27, 2015

SQL Server selecting from literal values

Filed under: sql server Himanshu @ 11:40 am

I didn’t know it was possible to do this in SQL Server:

select * from (values (1, 2), (3, 4), (5, 6)) AS Numbers (Odds, Evens)

Parking it here for my later reference. It can be handy for inserting a couple of rows into meta tables, like:

declare @external_internal_state_map table (external_state_code varchar(6), internal_state_code varchar(6))
insert into @external_internal_state_map (external_state_code, internal_state_code) values
('EXT1', 'INT1'), ('EXT2', 'INT2'), ('EXT3', 'INT3')
select * from @external_internal_state_map

May 27, 2014

Shorten the path by mapping drive letter pointing to folder

Filed under: Uncategorized Himanshu @ 12:59 pm

While working on a project, I noticed that Visual Studio was failing during compilation, and the reason for the failure was that the path was becoming too long.

The subst command in MS Windows allowed me to set up a drive letter that points to a physical folder, which overcame the situation.

Ref: https://technet.microsoft.com/en-in/library/bb491006.aspx
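For reference, the command takes the form `subst X: <folder>` to create the mapping and `subst X: /D` to remove it. Here is a small Python sketch that builds those commands and runs them only on Windows; the helper names and the sample folder are hypothetical.

```python
import subprocess
import sys
from typing import List


def _drive(letter: str) -> str:
    """Normalize a single letter such as 'p' to the 'P:' form subst expects."""
    if len(letter) != 1 or not (letter.isascii() and letter.isalpha()):
        raise ValueError(f"expected a single drive letter, got {letter!r}")
    return f"{letter.upper()}:"


def subst_map_command(drive_letter: str, folder: str) -> List[str]:
    """Build `subst X: <folder>`, which maps a drive letter to a folder."""
    return ["subst", _drive(drive_letter), folder]


def subst_unmap_command(drive_letter: str) -> List[str]:
    """Build `subst X: /D`, which removes the mapping again."""
    return ["subst", _drive(drive_letter), "/D"]


def run_if_windows(command: List[str]) -> bool:
    """Run the command on Windows; do nothing (and return False) elsewhere."""
    if sys.platform != "win32":
        return False
    subprocess.run(command, check=True)
    return True


# Hypothetical deep project folder; replace it with the real one.
deep_folder = r"C:\work\client\product\branches\feature\src"
print(" ".join(subst_map_command("p", deep_folder)))
```

After mapping, opening the solution from the short drive letter instead of the deep folder keeps the build paths short.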

August 15, 2012

Mathematics Reloaded

Filed under: electronics Himanshu @ 12:22 pm

I got stuck on an electronics puzzle. I was trying to understand resistance to electron flow, current, and voltage through a few different experiments. In one of them, I had created a very basic circuit that started from the +V end of a 1.5V battery (showing 1.4V on the multimeter without any resistance), went through a 470k resistor and the multimeter, and ended at the -V end of the battery. On the multimeter, I noticed 0.95V. The 470k resistor was the cause of the voltage drop, and it was dropping the voltage by 0.45V (1.4V - 0.95V = 0.45V).

I thought of adding one more 470k resistor, expecting it to drop a further 0.45V, so that the voltage would fall to 0.5V after the second 470k resistor. My reasoning was: 1.4V - 0.45V - 0.45V = 0.5V. But to my surprise I saw 0.71V on the multimeter! The multimeter didn’t agree with my mathematics, but I wasn’t fully convinced by the multimeter’s mathematics either. All I could think was “why” the second resistor was not dropping the voltage by the same amount as the first.

After further thinking and reading about V=IR, I noticed a very simple and basic property of the number series in which each number is created by adding 1 to its predecessor, viz. 1, 2, 3, 4, …n. That simple and basic property is: 2 is double 1, but 3 isn’t double 2. Further, 3 is 50% more than 2, but 4 is not 50% more than 3. The same holds for the circuit: each added resistor increases the total resistance by a smaller proportion, so the reading does not fall by a fixed amount per resistor.

Of course it’s obvious but it wasn’t so obvious to me before this experiment!
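One way to check the readings numerically is to treat the multimeter itself as a resistor in series with the 470k resistor(s). This is an assumption: the meter’s input resistance was not measured in the experiment, so the Python sketch below infers it from the first reading and then predicts the second. The function and variable names are illustrative.

```python
def meter_resistance(v_source: float, v_meter: float, r_series: float) -> float:
    """Infer the meter's resistance from one reading.

    Series divider: v_meter = v_source * r_meter / (r_meter + r_series),
    which rearranges to r_meter = r_series * v_meter / (v_source - v_meter).
    """
    return r_series * v_meter / (v_source - v_meter)


def meter_reading(v_source: float, r_meter: float, r_series: float) -> float:
    """Voltage across the meter when it sits in series with r_series."""
    return v_source * r_meter / (r_meter + r_series)


V_BATTERY = 1.4         # volts, measured with no resistor in the circuit
R_RESISTOR = 470_000.0  # ohms, one 470k resistor
FIRST_READING = 0.95    # volts, measured with one resistor

# Assumption: the meter behaves like a fixed resistor in the circuit.
r_meter = meter_resistance(V_BATTERY, FIRST_READING, R_RESISTOR)
predicted = meter_reading(V_BATTERY, r_meter, 2 * R_RESISTOR)

print(f"Inferred meter resistance: {r_meter / 1000:.0f}k")
print(f"Predicted reading with two resistors: {predicted:.2f}V")
```

With these numbers the inferred meter resistance comes out at roughly 992k, and the predicted reading with two resistors is about 0.72V, close to the 0.71V observed and nowhere near the 0.5V that subtracting a fixed 0.45V per resistor would give.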
