NYLXS Mailing Lists and Archives
NYLXS Members have a lot to say and share but we don't keep many secrets. Join the Hangout Mailing List and say your piece.

DATE 2014-11-01




DATE 2014-11-26
FROM Ruben
SUBJECT Subject: [LIU Comp Sci] Re: Database Management Systems: CS 649 A
Message-ID: <547577AE.50205-at-my.liu.edu>
Date: Wed, 26 Nov 2014 01:48:14 -0500
From: Ruben
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.1.0
MIME-Version: 1.0
To: Ping.Chung-at-liu.edu, samir Iabbassen ,
Subject: [LIU Comp Sci] Re: Database Management Systems: CS 649 A
References: <2013137211.2780.1416658541481.JavaMail.bbuser-at-b-ap1b.liu.edu>
In-Reply-To: <2013137211.2780.1416658541481.JavaMail.bbuser-at-b-ap1b.liu.edu>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Sender: owner-learn-at-mrbrklyn.com
Precedence: bulk
Reply-To: learn-at-mrbrklyn.com

On 11/22/2014 07:15 AM, Ping.Chung-at-liu.edu wrote:
> CS 649 Database Management Systems Fall 2014
> Instructor: Prof. Ping-Tsai Chung
> Homework – Relational Algebra, Relational Calculus, and Normalization
> (Total: 200 Points) Due: Dec. 10, 2014 (One day before our Thursday class)
> Send your file to pingtsaichung-at-gmail.com

Why Normalization Failed to Become the Ultimate Guide for Database Designers?

While trying to find marshall's claim that Alberto Mendelzon says the
universal relation is an idea re-invented once every 3 years (and later
finding a quote by Jeffrey Ullman that the universal relation is
re-invented 3 times a year), I stumbled across a very provocative rant
by a researcher/practitioner: Why Normalization Failed to Become the
Ultimate Guide for Database Designers? by Marin Fotache. It shares an
interesting wealth of experience and knowledge about logical design. The
author is obviously well-read and, unlike the usual debates I've seen
about this topic, presents the argument thoroughly and

The abstract is:

With an impressive theoretical foundation, normalization was
supposed to bring rigor and relevance into such a slippery domain as
database design is. Almost every database textbook treats
normalization in a certain extent, usually suggesting that the topic
is so clear and consolidated that it does not deserve deeper
discussions. But the reality is completely different. After more
than three decades, normalization not only has lost much of its
interest in the research papers, but also is still looking for
practitioners to apply it effectively. Despite the vast amount of
database literature, comprehensive books illustrating the
application of normalization to effective real-world applications
are still waited. This paper reflects the point of view of an
Information Systems academic who incidentally has been for almost
twenty years a practitioner in developing database applications. It
outlines the main weaknesses of normalization and offers some
explanations about the failure of a generous framework in becoming
the so much needed universal guide for database designers.
Practitioners might be interested in finding out (or confirming)
some of the normalization misformulations, misinterpretations,
inconsistencies and fallacies. Theorists could find useful the
presentation of some issues where the normalization theory was
proved to be inadequate, not relevant, or source of confusion.

The body of the paper presents an explanation for why practitioners have
rejected normalization. The author also shares his opinion on
potentially underexplored ideas as well, drawing from an obviously
well-researched depth of knowledge. In recent years, some researchers,
such as Microsoft's Pat Helland, have even said Normalization is for

(only to further this with later formal publications such as advocating
we should be Building on Quicksand). Yet, the PLT
community is pushing for the exact opposite. Language theory is firmly
rooted in formal grammars and proven correct 'tricks' for manipulating
and using those formal grammars; it does no good to define a language if
it does not have mathematical properties ensuring reliability and
repeatability of results. This represents and defines real tension
between systems theory and PLT.

I realize this paper focuses on methodologies for creating model
primitives, comparing mathematical frameworks to frameworks guided by
intuition and then mapped to mathematical notions (relations in the
relational model), and some may not see it as PLT. Others, such as Date,
closely relate understanding of primitives to PLT: Date claims the SQL
language is to blame and has gone to the lengths of creating a teaching
language, Tutorial D, to teach relational theory. In my experience,
nothing seems to affect lines
of code in an enterprise system more than schema design, both in the
data layer and logic layer, and often an inverse relationship exists
between the two; hence the use of object-relational mapping layers to
consolidate inevitable problems where there will be The Many Forms of a
Single Fact (Kent, 1988).
Mapping stabilizes the problem domain by labeling correspondences
between all the possible unique structures. I refer to this among
friends and coworkers as the N+1 Schema Problem, as there is generally 1
schema thought to be canonical, either extensionally or intensionally,
and N other versions of that schema.
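
As a rough illustration of the mapping idea above, here is a minimal
Python sketch of one canonical schema plus N variant record shapes, with
an explicit table of correspondences between them. Every name in it
(CANONICAL_FIELDS, MAPPINGS, to_canonical, the "billing" and "shipping"
variants) is hypothetical and invented for this example; it is not taken
from the post or from any particular ORM.

```python
# One canonical schema (assumed for illustration) and N variants of it.
CANONICAL_FIELDS = ("customer_id", "full_name", "city")

# Each entry labels the correspondence from a variant's field names to
# the canonical names. The variant names are made up for this sketch.
MAPPINGS = {
    "billing": {"cust_no": "customer_id", "name": "full_name", "town": "city"},
    "shipping": {"id": "customer_id", "recipient": "full_name", "city": "city"},
}


def to_canonical(source, record):
    """Rename a variant record's fields to the canonical schema."""
    mapping = MAPPINGS[source]
    renamed = {mapping[key]: value for key, value in record.items() if key in mapping}
    missing = [field for field in CANONICAL_FIELDS if field not in renamed]
    if missing:
        raise ValueError(f"{source} record lacks canonical fields: {missing}")
    return renamed


billing_row = {"cust_no": 7, "name": "Ada Lovelace", "town": "London"}
shipping_row = {"id": 7, "recipient": "Ada Lovelace", "city": "London"}

# Two forms of the same fact collapse to one canonical record.
same_fact = to_canonical("billing", billing_row) == to_canonical("shipping", shipping_row)
```

The point of the sketch is only that the correspondences are data: adding
an N+1th variant means adding one more entry to the mapping table rather
than touching the code that consumes canonical records.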

*Question: Should interactive programming languages aid practitioners in
reasoning about their bad data models, (hand waving) perhaps by modeling
each unique structure and explaining how they relate?* I could see
several reasons why that would be a bad idea, but as the above paper
suggests, math is not always the best indicator of what practitioners
will adopt. In many ways this seems to be the spirit of the idea behind
such work as Stephen Kell's interest in approaching modularity by
supporting evolutionary compatibility between APIs (source texts) and
ABIs (binaries), as covered in his Onward! paper, The Mythical Matched
Modules: Overcoming the Tyranny of Inflexible Software Construction.
Similar ideas have been in middleware systems for years and are known as
/wrapper architectures/ (e.g., Don’t Scrap It, Wrap It!), but haven't
seen much PLT
interest that I'm aware of; "middleware" might as well be a synonym for
Kell's "integration domains" concept.

By Z-Bo at 2010-01-09 00:24


live programming

This is sort of related to one of my principles for live programming: the
program should always run in a reasonable way, even if it has errors in it.
That there is code and that something was specified should always be
apparent in the program, even if what the code does is undefined because
of its erroneous state. Likewise, defaults should be reasonable so that
we can see things; e.g., if you create a rectangle and forget to set its
size, it should not be invisibly small (like in WPF...), but rather
something you can see and remember...oops I forgot to set the size.
Likewise, NaN shouldn't mean an object flies off the screen into imaginary
space; perhaps it could just start shaking or something. The point is
to provide visible feedback so the programmer can more quickly
understand what's wrong.
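
A minimal Python sketch of the "reasonable defaults" idea above, assuming
a hypothetical Rectangle type and an arbitrary DEFAULT_SIZE; neither comes
from WPF or from any real UI toolkit. An unset or NaN dimension is
replaced by a visible fallback, and a warning is collected so the mistake
stays apparent instead of being silently hidden.

```python
import math
from dataclasses import dataclass
from typing import Optional

DEFAULT_SIZE = 50.0  # assumed "big enough to see" default, arbitrary units


def visible(value, fallback):
    """Return value unless it is missing, NaN, infinite, or non-positive."""
    if value is None or not math.isfinite(value) or value <= 0:
        return fallback
    return value


@dataclass
class Rectangle:
    width: Optional[float] = None
    height: Optional[float] = None

    def render_size(self):
        """Return the size that would be drawn, plus warnings to surface."""
        warnings = []
        drawn = []
        for name, raw in (("width", self.width), ("height", self.height)):
            shown = visible(raw, DEFAULT_SIZE)
            if shown is not raw:
                warnings.append(f"{name} unset or invalid; drawing default")
            drawn.append(shown)
        return tuple(drawn), warnings


forgotten = Rectangle()                 # programmer forgot to set a size
broken = Rectangle(float("nan"), 20.0)  # a NaN leaked into the width
```

Here forgotten.render_size() still yields a visible 50 by 50 shape along
with two warnings, and broken.render_size() keeps the valid height while
replacing only the NaN width.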

Likewise, why are systems so brittle? In PL, we expect that a program
has one rigid unambiguous meaning, which means that any bug/mistake will
cause the system to explode vs. just degrading gracefully. So let's say
you fail to read a file because it doesn't exist...why not just log the
error and return some random file anyways? Sometimes, it won't even
matter. Martin Rinard's work on run time software patching comes to mind
here; e.g., the Living in the comfort zone paper from Onward 2007.
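
The "log the error and keep going" behaviour described above might look
like the following Python sketch. The function name
read_text_or_fallback and the choice of an empty-string fallback are
assumptions made for illustration; this is not Rinard's technique, only a
small example of degrading gracefully instead of raising.

```python
import logging
from pathlib import Path

logger = logging.getLogger("graceful")


def read_text_or_fallback(path, fallback=""):
    """Read a text file; on any OS-level failure, log it and return fallback."""
    try:
        return Path(path).read_text(encoding="utf-8")
    except OSError as exc:
        # The error stays visible in the log, but the caller keeps running.
        logger.warning("could not read %s (%s); using fallback", path, exc)
        return fallback


settings = read_text_or_fallback("/nonexistent/dir/settings.txt", fallback="mode=default")
```

Whether this is acceptable depends entirely on the caller: for a cosmetic
settings file it may not matter, while for data that must be correct the
usual fail-fast behaviour is the safer choice.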

This is not mainstream PLT, but maybe it should be. At any rate, the
systems community are pragmatic enough that they are exploring this area
fairly well.

By Sean McDirmid at Sat, 2010-01-09 05:25

Normalization Failed?

Seems to me like academic twaddle.

In practice (as opposed to the world of Date) a good understanding of
3rd normal form is the essential starting point for any database
designer. That is analysis, not design.

And then, as every serious analysis and design methodology has explained
since 1980, you denormalise to support the required processes - as need be.

By grahamberrisford at Sat, 2010-01-09 15:14
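
The workflow in the comment above (analyse in 3NF, then denormalise to
support a required process) can be sketched with Python's standard
sqlite3 module. The customer and purchase tables, their columns, and the
purchase_report table are all invented for this example; the "required
process" is assumed to be a per-city sales report.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Analysis: each fact is stored once (3NF).
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        city        TEXT NOT NULL
    );
    CREATE TABLE purchase (
        purchase_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        amount      REAL NOT NULL
    );
    INSERT INTO customer VALUES (1, 'Ada', 'London'), (2, 'Linus', 'Helsinki');
    INSERT INTO purchase VALUES (10, 1, 25.0), (11, 1, 5.0), (12, 2, 40.0);

    -- Design: a denormalised copy for reporting. Name and city are now
    -- repeated on every purchase row, so they depend on customer_id
    -- rather than on the key purchase_id.
    CREATE TABLE purchase_report AS
        SELECT p.purchase_id, p.customer_id, c.name, c.city, p.amount
        FROM purchase AS p
        JOIN customer AS c ON p.customer_id = c.customer_id;
    """
)

# The report process reads one table and needs no join.
totals = conn.execute(
    "SELECT city, SUM(amount) FROM purchase_report GROUP BY city ORDER BY city"
).fetchall()
```

The cost of the denormalised table is the usual one: if Ada moves, her
city must be updated in every purchase_report row, or the report and the
customer table will disagree.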

I agree

3NF (actually, BCNF) is extremely helpful, especially given the recent
changes in hardware (solid state disks) and database research (Adam
Marcus's MIT masters thesis on heap file structures suitable to
"navigable" relational databases; see BlendDB: Blending Table Layouts to
Support Efficient Browsing of Relational Databases). In my
books, if the hardware guys can solve the "write problem" with solid
state (which I don't believe they have), then you will see a dramatic
reshaping of scaling practices. Solid state is simply a gamechanger; it
removes the "denormalize for performance" advice from the equation,
because with constant time disk access, redundant data actually slows
clusters of disks down!

I am not just pitching this topic out there. I am fairly well-read in
relational database theory. You can't just call it academic twaddle.
There is real tension between systems theory, database systems theory,
and PLT views on how to best solve problems. See: The Great MapReduce
Debate and the follow-up Mike Stonebraker's counterarguments to MapReduce's
Obviously, head technical folks at Google were very much in disagreement
with Stonebraker, calling his comparison a "category error" and saying
Stonebraker is no longer on the cutting edge (mind you, Stonebraker has
the best track record for start-up ventures using cutting edge research
of anybody in IT history; this was like saying Brett Favre should just
retire). Outside of Google, others criticized Stonebraker as well. To me,
this seems like a modularity problem with database systems, and opens
the gateway for using MapReduce-like techniques to help build SELF-*
based systems.

"as every serious analysis and design methodology has explained since
1980, you denormalise to support the required processes - as need be."

Understanding behavioral requirements (processes) is non-trivial,
especially in the face of mergers and acquisitions. This is why model
checking tools like Alloy exist (and are based on relational logic).
Where I work, we try to avoid enterprise-style integration wherever
possible. For clients that don't need it, it is simply more costly and
just a development hassle. I agree with Stonebraker here; there is just
too much middleware:

I think my pet peeve is one of the things I talked about this
morning in my invited talk at SIGMOD 2002: there is just too much
middleware. The average corporation has bought a portal system, has
bought an enterprise application integration system, has bought an
ETL (Extraction, Transformation, and Loading) system, has bought an
application server, maybe has bought a federated data system. All of
these are big pieces of system infrastructure that run in the middle
tier; they have high overlap in functionality, and are complicated,
and require system administrators. The average enterprise has more
than one of all of these things, and so they have this spaghetti
environment of middleware, big pieces of moving parts that are
expensive to maintain and expensive to use.

Everyone seems to recognize this problem, and the conventional
commercial wisdom is to expand the role of an application server so
it does what all of these packages do. WebSphere, for example, from
IBM, is becoming a very, very rich package which does a lot of
middleware functionality.

I think a federated database system is a much better base on which
to build a middleware platform than is an application server. And
the reason is that application servers only manage code, and then
the data is relegated to the bottom tier. If an application needs
some data, it runs in the middle tier and requests data from the
bottom tier. You end up moving data to the code. If you had a
federated data system, so that the data manager was running at the
middle tier and at the bottom tier---and object-relational engines
are perfectly happy to store and activate functions--- then code and
data could be co-mingled on a platform. And you could then do
physical database design in such a way that you put the data near
the code that needed it, and you wouldn’t end up shipping the data
to the code all the time. I think that’s a much more long-term,
robust way to build sophisticated middleware. So I’d work on
trying to prove that that was a good idea if I had some more cycles
at work---but I don’t.
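
The contrast drawn above between moving data to the code and putting code
next to the data can be approximated in a few lines of Python with the
standard sqlite3 module. This is only an analogy: SQLite runs in-process
and is not a federated or object-relational system, and the account
table and the risk_score rule are invented for illustration. What it
does show is a user-defined function being registered with the engine
(via Connection.create_function) so that only the matching rows leave
the data tier.

```python
import sqlite3


def risk_score(balance, late_payments):
    """Hypothetical business rule, made up for this sketch."""
    return late_payments * 10 - balance / 100.0


conn = sqlite3.connect(":memory:")
conn.create_function("risk_score", 2, risk_score)  # engine can now call it
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL, late_payments INTEGER)"
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [(1, 1000.0, 0), (2, 200.0, 3), (3, 50.0, 5)],
)

# Data to code: every row is shipped to the application, then filtered.
shipped = conn.execute("SELECT id, balance, late_payments FROM account").fetchall()
risky_app_side = [row[0] for row in shipped if risk_score(row[1], row[2]) > 20]

# Code to data: the engine evaluates the function; only matching ids return.
risky_engine_side = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM account WHERE risk_score(balance, late_payments) > 20 ORDER BY id"
    )
]
```

Both approaches select the same accounts; the difference is how many rows
cross the boundary between the data manager and the application code.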

  1. 2014-11-08 Ruben <ruben.safir-at-my.liu.edu> Subject: [LIU Comp Sci] Re: Welcome to learn
  2. 2014-11-08 Ruben <ruben.safir-at-my.liu.edu> Subject: [LIU Comp Sci] second post!!
  3. 2014-11-22 Ruben Safir <mrbrklyn-at-panix.com> Subject: [LIU Comp Sci] Oracle Files for the Homework and Oracle Resources
  4. 2014-11-22 Ruben Safir <mrbrklyn-at-panix.com> Subject: [LIU Comp Sci] Oracle Webineir on pl/sql
  5. 2014-11-22 Ruben Safir <mrbrklyn-at-panix.com> Subject: [LIU Comp Sci] UEFI and Secure Boot
  6. 2014-11-24 Ruben <ruben.safir-at-my.liu.edu> Re: [LIU Comp Sci] Oracle DBA short cuts
  7. 2014-11-24 Ruben Safir <mrbrklyn-at-panix.com> Subject: [LIU Comp Sci] Oracle DBA short cuts
  8. 2014-11-24 Ruben Safir <mrbrklyn-at-panix.com> Subject: [LIU Comp Sci] Re: Database Management Systems: DBMS Announcement on Nov. 23
  9. 2014-11-26 Ruben Safir <mrbrklyn-at-panix.com> Re: [LIU Comp Sci] Problems with Normalization
  10. 2014-11-26 Ruben <ruben.safir-at-my.liu.edu> Subject: [LIU Comp Sci] Re: Database Management Systems: CS 649 A
  11. 2014-11-26 Ruben Safir <mrbrklyn-at-panix.com> Subject: [LIU Comp Sci] More problems with modeling and normalization
  12. 2014-11-26 Ruben Safir <mrbrklyn-at-panix.com> Subject: [LIU Comp Sci] Normalize because your professor said too
  13. 2014-11-26 Ruben Safir <mrbrklyn-at-panix.com> Subject: [LIU Comp Sci] Problems with Normalization
  14. 2014-11-26 Ruben Safir <mrbrklyn-at-panix.com> Subject: [LIU Comp Sci] Why Data Models Shouldn't Drive Object Models (And Vice Versa)
  15. 2014-11-26 Ruben Safir <mrbrklyn-at-panix.com> Subject: [LIU Comp Sci] Why Normalization Failed to Become the Ultimate Guide for Database
  16. 2014-11-29 Ruben Safir <mrbrklyn-at-panix.com> Subject: [LIU Comp Sci] Memory Cache theory Architecture Class
  17. 2014-11-30 Ruben Safir <mrbrklyn-at-panix.com> Subject: [LIU Comp Sci] Cache Model in C programming

NYLXS are Do'ers and the first step of Doing is Joining! Join NYLXS and make a difference in your community today!