As we are nearing the end of a very interesting year, I was mesmerizing about the changes we saw this year. This took me back to what I’d later would call the very beginning of what would become my personal IT-Career, animals were still able to talk and we were all happily typing away on those awesome 16-bit 80286 machines. Salespersons told us that we’d never be able to fill that enormous 40MB Ultra DMA/33 hard disk, nor would we ever run out of that 1MB of ram, hence the now famous quotes “DOS addresses only 1 Megabyte of RAM because we cannot imagine any applications needing more.” and the even more famous “640 kB is enough”
We all worked in either Microsoft Works of Lotus 1-2-3, both had an integrated spreadsheet, word processor and database program that we could use to process data and it ran with a minimum of 256k and a recommended 512k of memory. Well let’s say processing data was simple, simply because there wasn’t much data that we would want to analyze on a computer.
Although this process looked simple, it can be seen as the Mesopotamian era of data processing, people had to write their own datasets and were able to share insights by sharing the data on floppy disks. This meant the data was very small and suited to sustain a small insight or drive small decisions. The included “database” program. The database management system, while still a “flat file” (i.e., non-relational) even allowed the novice user to perform transformations through formulas and even create user-defined reports which then had to be copied as text to the clipboard.
Think further about this, I looked into the history of data as a whole, and found some fun facts I’d love to share with you all 😊
In 1974, the IBM San Jose Research center developed a relational DBMS, System R, to implement Codd’s concepts. A key development of the System R project was SQL. To apply the relational model, Codd needed a relational-database language he named DSL/Alpha. At the time, IBM didn’t believe in the potential of Codd’s ideas, leaving the implementation to a group of programmers not under Codd’s supervision, who violated several fundamentals of Codd’s relational model; the result was Structured English QUEry Language or SEQUEL.
When IBM released its first relational-database product, they wanted to have a commercial-quality sublanguage as well, so it overhauled SEQUEL, and renamed the basically-new language Structured Query Language (SQL).
In this era Robert Epstein, Mark Hoffman, Jane Doughty, and Tom Haggin founded Sybase (initially as System ware) They set out to create a relational database management system (RDBMS) that would organize information and make it available to computers within a network, 1986: Sybase enters into talks with Microsoft to license Data Server, a database product built to run on UNIX computers. Those talks led to a product called Ashton-Tate/Microsoft SQL Server 1.0, which shipped in May 1989.
Remember the average computer in 1985 was a IBM compatible x80286 chip and had:
Now let’s fast forward to the mid-nineties, machines were still following More’s law and somewhere on tour de route the animals had lost their speech. IBM unleashed DB2, or IBM Database 2 in 1983 on its MVS mainframe platform, Microsoft Released SQL Server 6, Oracle had created the first read-consistent database in 1994 and Intel introduced the Pentium Series CPU, in many ways this lead to a revolution switching from a 32 (486) to an 64-bit external data bus (the Pentium P5) machines gave us the possibility to use more resources, and machines slowly became more powerful.
Remember the average PC on 1995 was a Pentium chip and had:
This context is important because it is in this world that Ralph Kimball wrote his first edition of “The Data Warehouse Toolkit”. [ISBN 978-0-471-15337-5]. Datasets were still reasonably small but the hardware wasn’t able to handle even these smaller datasets. In Ralph Kimball’s dimensional design approach (also called the bottom-up design) data marts facilitating reports and analysis are created first; these are then combined together to create a broad data warehouse. Keeping in mind the most important business aspects or departments, data marts are created first. These provide a thin view into the organizational data and, as and when required, these can be combined into a larger data warehouse. Kimball defines data warehouse as “a copy of transaction data specifically structured for query and analysis”.
Kimball’s data warehousing architecture focuses on ease of end-user accessibility and provides a high level of performance to the data warehouse.
After everyone had survived the Millennium bug, and the absence of any cataclysmic disasters, a new mayor revolution changed the Data Processing world, AMD release their x86-64 processors (Opteron). It introduces two new modes of operation, 64-bit mode and compatibility mode, along with a new 4-level paging mode. With 64-bit mode and the new paging mode, it supported vastly larger amounts of virtual memory and physical memory than what was possible on its 32-bit predecessors, allowing programs to store larger amounts of data in memory.
Until then SQL Server 2000 (32-bit) could only access as much as 32GB of memory, but needed to use Address Windowing Extensions (AWE) to access any memory it needs beyond the virtual memory limit. The important point about how SQL Server used AWE memory was that only the database page cache (i.e., data and index pages) could utilize physical memory outside the virtual memory of the process. All the other uses of memory—including plan cache, query workspace memory, locks, and other structures such as user connections, cursors, and space used by utilities such as backup and restore—were still limited to the virtual address space. Therefore, if your SQL Server application faces memory pressure in these parts of the system, adding additional memory beyond 3GB on 32-bit systems didn’t yield enough benefits.
To cope with that Microsoft releasing the (almost unknown) IA64 version of SQL server 2000, but in 2005 it finished the complete revision of the old Sybase code into Microsoft code and released SQL Server 2005. This would become the first x64 edition of SQL Server.
With this change Microsoft effectively removed the memory boundaries the data platform was heavily suffering from.
While invented in 1978 (Not a typo using DRAM), the real break true of the SSD’s came with the introduction of MLC cells in Enterprise Flash Drives or EFDs in approximately 2012 finally broke the IOPS barrier in a consistent way. As SSDs do not need to spin or seek to locate data, they proved vastly superior to HDDs in random read tests. However, SSDs had challenges with mixed reads and writes, and their performance may degrade over time. Most of the advantages of solid-state drives over traditional hard drives are due to their ability to access data completely electronically instead of electromechanically, resulting in superior transfer speeds and mechanical ruggedness. Also to achieve the same throughput as a single SSD, you needed between 12 and 24 HD’s in a RAID 1+0 configuration. In serial IO these configurations could outperform a single SSD and were more resilient to data loss.
Once you started to really use SSD’s where the IOPS bottlenecks lived, the SSD equipped servers started to outperform disk arrays easily.
The other game changer was… Azure.
The original Azure host operating system was a fork of the Windows OS called the ‘Red Dog OS’. Azure was pioneering functionality important to data centers everywhere. Running a fork of an OS is not ideal (in terms of the additional cost and complexity), so the Azure team talked to the Windows team.
During Build 2014 the Azure team announced over 40+ updates to the Microsoft Azure cloud platform, including significant enhancements to Virtual Machines, Storage, Web Sites, Mobile Services and Azure SQL Database. In addition, a new Azure Preview Portal was released that provides a customizable end-to-end DevOps experience along with Azure Resource Manager, a new framework for composing and managing cloud applications.
Also Windows Azure got renamed to Microsoft Azure to reflect the strategy and focus on Azure as an enterprise-grade cloud platform that supports one of the broadest set of operating systems, languages and services of any public cloud.
This was just the beginning though…
Azure Gen 2 Is rolling out, and now the data platform is ready to shake your world, as it will challenge you to question the old gods and the old ways of working.
My personal feeling about this: Challenge Accepted!
© 2023 Kohera
Crafted by
© 2022 Kohera
Crafted by
Cookie | Duration | Description |
---|---|---|
ARRAffinity | session | ARRAffinity cookie is set by Azure app service, and allows the service to choose the right instance established by a user to deliver subsequent requests made by that user. |
ARRAffinitySameSite | session | This cookie is set by Windows Azure cloud, and is used for load balancing to make sure the visitor page requests are routed to the same server in any browsing session. |
cookielawinfo-checkbox-advertisement | 1 year | Set by the GDPR Cookie Consent plugin, this cookie records the user consent for the cookies in the "Advertisement" category. |
cookielawinfo-checkbox-analytics | 11 months | This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics". |
cookielawinfo-checkbox-functional | 11 months | The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional". |
cookielawinfo-checkbox-necessary | 11 months | This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary". |
cookielawinfo-checkbox-others | 11 months | This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other. |
cookielawinfo-checkbox-performance | 11 months | This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance". |
CookieLawInfoConsent | 1 year | CookieYes sets this cookie to record the default button state of the corresponding category and the status of CCPA. It works only in coordination with the primary cookie. |
elementor | never | The website's WordPress theme uses this cookie. It allows the website owner to implement or change the website's content in real-time. |
viewed_cookie_policy | 11 months | The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data. |
Cookie | Duration | Description |
---|---|---|
__cf_bm | 30 minutes | Cloudflare set the cookie to support Cloudflare Bot Management. |
pll_language | 1 year | Polylang sets this cookie to remember the language the user selects when returning to the website and get the language information when unavailable in another way. |
Cookie | Duration | Description |
---|---|---|
_ga | 1 year 1 month 4 days | Google Analytics sets this cookie to calculate visitor, session and campaign data and track site usage for the site's analytics report. The cookie stores information anonymously and assigns a randomly generated number to recognise unique visitors. |
_ga_* | 1 year 1 month 4 days | Google Analytics sets this cookie to store and count page views. |
_gat_gtag_UA_* | 1 minute | Google Analytics sets this cookie to store a unique user ID. |
_gid | 1 day | Google Analytics sets this cookie to store information on how visitors use a website while also creating an analytics report of the website's performance. Some of the collected data includes the number of visitors, their source, and the pages they visit anonymously. |
ai_session | 30 minutes | This is a unique anonymous session identifier cookie set by Microsoft Application Insights software to gather statistical usage and telemetry data for apps built on the Azure cloud platform. |
CONSENT | 2 years | YouTube sets this cookie via embedded YouTube videos and registers anonymous statistical data. |
vuid | 1 year 1 month 4 days | Vimeo installs this cookie to collect tracking information by setting a unique ID to embed videos on the website. |
Cookie | Duration | Description |
---|---|---|
ai_user | 1 year | Microsoft Azure sets this cookie as a unique user identifier cookie, enabling counting of the number of users accessing the application over time. |
VISITOR_INFO1_LIVE | 5 months 27 days | YouTube sets this cookie to measure bandwidth, determining whether the user gets the new or old player interface. |
YSC | session | Youtube sets this cookie to track the views of embedded videos on Youtube pages. |
yt-remote-connected-devices | never | YouTube sets this cookie to store the user's video preferences using embedded YouTube videos. |
yt-remote-device-id | never | YouTube sets this cookie to store the user's video preferences using embedded YouTube videos. |
yt.innertube::nextId | never | YouTube sets this cookie to register a unique ID to store data on what videos from YouTube the user has seen. |
yt.innertube::requests | never | YouTube sets this cookie to register a unique ID to store data on what videos from YouTube the user has seen. |
Cookie | Duration | Description |
---|---|---|
WFESessionId | session | No description available. |