The Thought Blender

A CSConnell weblog

Data – The Foundation of the Internet Food Pyramid

Learning From the Past

I started working 20 years ago, and while times were very different (Internet access was pretty limited, with only 3 million users worldwide, and nobody knew what software as a service would really mean) some things haven’t changed that much.

My first job was with an engineering company, and one of the things that we did there was to help electric providers access and manage their data.  All of these organizations were responsible in some way for generating piles of data.  Not only the information about how their facilities were licensed and built, but notes, documents, specifications and every bit of information you could imagine related to maintaining, operating, and modifying the facility.

It was great: these companies had tons of data about everything they had ever done – they had the power of data (pun intended). Well, in theory anyway. The problem for most of these companies was twofold. First, most of them did not have direct access to this data. Even though it was about them, many times it wasn’t considered theirs. The engineering and maintenance firms either considered it their own or convinced the power providers that they (the engineering companies) were in the best position to hold onto this data, usually in systems to which only the engineering firms had access.

At best the electric provider would have a room full of paper documents that reflected the work that had been done. This made getting access to that information a time-consuming, and sometimes frustrating and costly, exercise. This led to the second problem – this data was primarily at rest. In other words, since accessing the data was time-consuming and costly, no one had any incentive to use it to do anything until they absolutely needed to, so the data just sat there most of the time.

Some of our more forward-thinking customers leveraged our services to address these issues, building systems where they could capture, store, retrieve, and more effectively use all of this data. That gave them real power. Not only were they in a better position to respond to both internal and external requests, they were also in a much better position to proactively improve their facilities and processes in ways that they had not previously had the stomach to attack due to the extra time and costs.

Apply Those Lessons Now

Everything happens in cycles, and looking at businesses on the Internet today we see some of the same problems. Sure, we have widespread connectivity, new tools, high-powered servers and software capable of creating and churning through terabytes of data, but what is anybody doing with it? Many companies out there find they are in a position not that different from those electric providers of 20 years ago, collecting piles of data in any number of systems or logs. But once it is there, what is anyone doing with it? What are you doing with it?

Take Charge of Your Data

At some point in the lifespan of any company you need to ask yourself about your data strategy. You need to think about all of the opportunities that data can afford you, from providing a better service, to attracting more customers, and ultimately improving the bottom line of your company. Chances are that if you are not asking these questions, one of your competitors is giving them some consideration, and while I for one do not put much stock in the “competitor x is doing y, so I need to as well” line of thinking, ignoring your data strategy does a disservice to both you and your customers.

Today there are any number of tools and vendors out there that will tell you that they can help you with these kinds of data management and usage problems. One tool that you can add to your arsenal for executing against a data strategy is a Data Management Platform (DMP). “Great…” you are probably thinking, another buzzword and another vendor to worry about. “What is a DMP and how can one help me anyway?”

Pick the Right Tools and Partners

First, I would suggest that you do not view whomever you choose to work with as a vendor, but rather that you view them (and they view you) as a partner. Now that we have that out of the way, I am going to ignore the many definitions that you might find on vendor or industry sites, and instead tell you what I think a DMP is and what it can do to help you.

I view a DMP as a platform that helps you:

  • Determine how your customers or users interact with you through any number of contact points, including a desktop browser, a mobile device, or even other applications you may use in your enterprise;
  • Combine this data with data you may have from other platforms such as a CRM or POS or third party data;
  • Allow you to view that data in a way that makes sense to you;
  • Analyze and report on that data so that you can visualize the relationships between the various ways that your users interact with you;
  • Use all of that data to create a better user experience for that user, whether that be showing them more targeted content, serving them a more relevant ad, or even providing them a better purchasing or call center experience;
  • Use the data any way you see fit – it is your data after all.
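As a rough illustration of the "combine" step in the list above, here is a minimal Python sketch. The record shapes (`user_id`, `channel`, `segment`) and the function name `combine_profiles` are hypothetical and chosen only for this example; a real DMP would have its own identifiers and matching rules.

```python
from collections import defaultdict


def combine_profiles(interaction_events, crm_records):
    """Merge interaction events with CRM records that share a user id.

    Both inputs are lists of dicts. The field names used here are
    assumptions made for the sketch, not any particular product's schema.
    """
    profiles = defaultdict(lambda: {"channels": set(), "events": 0, "crm": None})

    # Count interactions and remember which contact points each user touched.
    for event in interaction_events:
        profile = profiles[event["user_id"]]
        profile["channels"].add(event["channel"])
        profile["events"] += 1

    # Attach the CRM record (if any) to the same profile.
    for record in crm_records:
        profiles[record["user_id"]]["crm"] = record

    return dict(profiles)


events = [
    {"user_id": "u1", "channel": "desktop"},
    {"user_id": "u1", "channel": "mobile"},
    {"user_id": "u2", "channel": "mobile"},
]
crm = [{"user_id": "u1", "segment": "loyal"}]
profiles = combine_profiles(events, crm)
```

The point of the sketch is only that the combined view lives in a structure you control, so you can report on it or pass it to other systems however you like.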

You may choose to use any number of those types of features in combination or not use any of them at all, but a DMP should ultimately provide those options so that it can support your organization as it grows and evolves. Elaborating a bit on the points above, a DMP shouldn’t be a closed system that collects your data and just sucks it into a black box where you get little, if any, choice in exactly how to view it or use it. Instead a DMP should be an open platform that allows you to combine data from multiple sources, and allows you to use that data any way you want through APIs and integrations. You don’t want to be one of those electric providers mentioned above that always had to ask the engineering firm for permission to get their own data. Instead you want to be the company that is in charge of its own data and strategy. You should be in charge, and you should have options to control the data that is collected, the classification or categorization and lifespan of that data, and the use of the data. A system that provides defaults for you is great. But a system that has defaults and gives you the option of customizing them is even better. One that offers all of that, is easy to use, and offers some level of transparency into how the system works is best.

Remember Privacy

And let’s not forget the ePrivacy Directive in Europe, which makes ownership of all data emanating from a given site the responsibility of that site’s owners. If you do not control your data, but cede even a small amount of control to other companies who pixel your site, the ePrivacy Directive is clear – this will become your liability. It is vital to understand all of the tracking on your own site. Work with the partner you choose to set up a system to regularly monitor and audit all the code on your sites. This is more than just a “cookie audit.” Much of the tracking covered by the ePrivacy Directive doesn’t use cookies at all. You need to know the actual scripts that run on your pages. If you haven’t obtained a full tracking audit recently, be sure this is your first step. You’ll be surprised by the results. Once the audit is complete, you’ll need to categorize each tracker as essential or non-essential, and then rank them on a scale of relative intrusiveness. Bake this into your data strategy at the outset of your planning.
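The categorize-and-rank step can be kept very simple. The following is a minimal Python sketch, assuming a hypothetical 1 to 5 intrusiveness score that you assign during the audit; the `Tracker` class, the scale, and the example tracker names are all illustrative and not taken from the Directive or from any tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tracker:
    name: str           # script or vendor name found during the audit
    essential: bool     # is it needed for the site to function?
    intrusiveness: int  # hypothetical scale: 1 (least) to 5 (most)


def rank_trackers(trackers):
    """Split trackers into essential and non-essential, most intrusive first."""
    def most_intrusive_first(tracker):
        # Negate the score so higher values sort first; break ties by name.
        return (-tracker.intrusiveness, tracker.name)

    essential = sorted((t for t in trackers if t.essential), key=most_intrusive_first)
    non_essential = sorted((t for t in trackers if not t.essential), key=most_intrusive_first)
    return {
        "essential": [t.name for t in essential],
        "non_essential": [t.name for t in non_essential],
    }


audit = [
    Tracker("session-cookie", essential=True, intrusiveness=1),
    Tracker("ad-pixel", essential=False, intrusiveness=4),
    Tracker("analytics-script", essential=False, intrusiveness=2),
]
report = rank_trackers(audit)
```

Keeping the result as plain data makes it easy to rerun the same ranking every time the audit is repeated.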

Put Your Data to Work

Once you have a data strategy and start working with a DMP, then you can put that data to work for you, moving from a state of rest to a state of competitive advantage. Build a strategy, pick a DMP partner, and wow all of your customers by providing them great content, meaningful targeted ads that they are more likely to find interesting, or even person-to-person customer support.

February 2, 2012 | Data, Data Management Platform

Choosing a Portal Platform

Why Look at Portals?

Along with our recent effort to build a collection of common/shared Identity and Security Services for our products leveraging an LDAP / Directory server (see posts on LDAP comparison) and a security framework (more on this in another post), we are also working to carry the service paradigm forward into the user interface.   I don’t feel like portals really get as much attention as they deserve in the service oriented architecture (SOA) world, while other things that aren’t really service oriented at all end up being associated with the SOA buzz (see my thoughts on Service Oriented Architecture).  Having said that, it seems to me that portals can provide an especially good representation of services and allow for aggregate applications in a way that most other presentation options don’t.  You can really gain quite a bit of flexibility by creating portlets that leverage either one or many services.

The rest of this post will talk about some of our high level requirements for choosing a portal and what we evaluated. I’ll tell you what made the short list, what didn’t and why, and then follow up with another post that discusses our short list candidates and how we picked a winner.

The Requirements

Our overall goal is to be able to use the portal to create a flexible product that we then deliver over the internet.  We don’t sell the product itself – we only sell the services that our products deliver, and then the ability to get to our data using our flexible front end (soon to be available in a portal).  Some of our requirements should be things that all people consider, while others, such as the fit to our particular architecture and technology stack, will be more specific to us.  Keeping that in mind, here were some of our high level requirements:

  • Standards are important, so we want JSR-168 / JSR-286 and WSRP support. WSRP support, in my opinion, is better even than the JSR support, because it allows for a better dispersal of portlets and services. Each JSR portlet needs to reside on the portal server, whereas only one WSRP implementation need be resident on the portal – all the WSRP portlet “guts” can be anywhere else. Having said that, the WSRP and WSRP2 portlet implementations are normally JSR-compliant portlets themselves.
  • Support for LDAP directly or through a security framework, with as little duplication of storage as possible.  Plus, the LDAP server should be the source of as much of the identity data as possible.
  • Customization and configurability are key.  The end user need not even know they are really using a portal … at the very least, the portal should be branded and have a look and feel that says it is a Lotame product and not look like someone else’s.  We may also need to change the layouts, the login, and remove components that we don’t need.
  • Ease of use.  This falls into two categories – end users and developers.
    • End users need to be able to use whatever we create.  While much of that falls to us, the portal framework also plays a significant role.
    • Developers should be able to figure out how to use the portal, how to configure it, how to customize, add new pages, content and portlets without having to spend days and days just to get started.  While it may take a while to become a portal “expert”, you should be able to get up and running in a couple of hours.
  • It should fit nicely into our technology stack. We run a Linux and MySQL environment here. Anything that doesn’t run on those is out! The bulk of our development is done in Java and PHP and we prefer not to add any new languages if it can be avoided. You can begin to see a trend here: we also prefer open source products, so we will be sticking with products that are either open source or otherwise very affordable.
  • It should also fit into our architectural roadmap. At a high level we are choosing a portal because conceptually it does fit with our vision of creating aggregate applications through use and reuse of services, but the specific portal we choose needs to be a good long term choice based on where we are taking our entire product line.
  • Some document management or CMS ability would also be nice to have.
  • Good documentation and community support.

The Contenders

We investigated or otherwise considered the products below.  For those of you who read this and think … “but there is a more recent version out now”, I recognize that, but this evaluation was done essentially in November of 2008.  There are also many other portals and portal frameworks out there, including ones for pay such as Vignette, BEA, IBM, Oracle and others.  The portals below were chosen because 1) they met our requirements above and 2) we were aware of them.

Product                            Version      Link to Site
Jahia Community Edition            5            http://www.jahia.com/jahia/Jahia
Jetspeed                           2            http://portals.apache.org/jetspeed-2/
JBoss Portal (Community Edition)   4.7          http://www.jboss.org/jbossportal/
eXo Portal Community Edition       2.2          http://www.exoplatform.com/portal/public/en/
Light Portal                       ? Pre-1.0    https://light.dev.java.net/
Gridsphere                         3.1          http://www.gridsphere.org/gridsphere/gridsphere
Liferay Portal Standard Edition    5.1.2        http://www.liferay.com

The First Pass

Coming into the evaluation I was the most excited about 3 portals – Jetspeed, eXo, and Liferay. This was primarily due to some previous experiences as well as just a general respect for each of these communities. I really didn’t know enough about the others to render much of an opinion one way or another, with the possible exception of JBoss, since we run the JBoss App Server and I knew that there was (at some point at least) some work done by Novell on the JBoss Portal. In any case, I went and downloaded each of the portals and gave them a try. In many cases I also encouraged some of our developers to do the same so I could also learn from their experiences.

Jahia Community Edition

Jahia is a very nice product that meets most of the requirements that we have, including very good support for content management, JSR-168 portlets, good administrative tools, and good support for developers including support for Maven and Ant. Jahia has a very strong offering, with only one problem for us … no support for LDAP in the Community version. Seeing as how LDAP integration was one of our stated high level requirements, this was a problem. Also, since we did not find any information readily available about integrating a security framework such as Acegi / Spring Security with Jahia (which would allow for LDAP integration via one level of indirection), this pretty much sealed Jahia’s fate.

Regardless of our decision, Jahia is a very good product.  You can find more information about its features here.

Light Portal

Per the website:

Light is an Ajax and Java based Open Source Portal framework which can be seamless plugged in to any Java Web Application or as an independent Portal server. One of its unique features is that it can be turned on when users need to access their personalized portal and turned off when users want to do regular business processes.

This is in fact true. It is a framework that shows some promise, but it was a little too light for us. In other words, it left us with a lot to do and it didn’t have any CMS capabilities, LDAP integration, etc. It would seem that a lot of new work is going on, and I just read a message here at the end of December that said 1.0 is on its way. Not right for us, but I am interested in what happens with Light and I’ll keep watching.

Gridsphere

I ended up not spending too much time looking at Gridsphere. It seems pretty straightforward to set up and use, but the feature base didn’t have everything I wanted (again LDAP, CMS). There is some documentation out there that seems to be accurate, but honestly, there isn’t that much of it. Gridsphere was also not right for us.

eXo Portal Community Edition

I was pretty excited about eXo. When I was doing my first portal implementation in 2003 or so eXo wasn’t quite what we needed, but it was really coming along and I went into this evaluation looking forward to using it. I downloaded both the All in One bundle as well as the 2.2 Portal Tomcat binary (both from here). Getting things up and running with these packages was very straightforward, and eXo looked really great right out of the box. We even took a look at the WebOS / desktop portion of the product in the All in One package, and it looks really cool (not something we need right now, but it works). From a feature standpoint eXo looks really strong, particularly in its ability to deal in a flexible manner with LDAP models. If you don’t have a directory tree already set up, then it will create a tree that supports what it needs to do. If you have a tree already, eXo can be configured to work with it (see the LDAP config info). eXo also has great standards support, including JSR-168 / JSR-286, WSRP 2, JSR-170 (for content), and supports the JAAS security framework. On top of that it supports AJAX, a REST API, drag and drop, customization, and bunches of flexibility.

eXo looked great and looked like it would make our short list for sure … until we tried to really use it. I was continuously stumped by the interface when I was trying to create new pages, new portlet instances, and basically place our applications into the portal. The interface was not working in any way that I expected and I could not seem to create a new instance of a portlet. Assuming it was just me being thick, I had several other developers install it and take an independent look before I made any significant comments. Each one had the same issue – they thought it was easy to install and looked great, but couldn’t figure out how to create new portlet instances. I was sure that we were missing something in the setup, or missing something very simple right in front of our faces. The documentation we were looking at did not help us along.

Unfortunately, that led to eXo not making our short list.  I know there are bunches of people out there using eXo, and I am sure it is a fantastic platform and I can see all the work that has gone into it.  The problem is that our experience violated one of our high level requirements – that it be easy for developers to get up and get started without being a portal expert.  I am willing to chalk our experience up to some lack of understanding or missing of the obvious on our part, and certainly look forward to any info that we might get from eXo or eXo users.  While ultimately our choice has already been made, I want to understand what went wrong with our look at eXo and would be willing to consider it in the future.

Jetspeed 2

My personal experiences with portals have been very heavy in Jetspeed (and the implementation from Gluecode, which is now part of IBM – the Gluecode portal product no longer exists to the best of my knowledge) and Vignette, so I was also excited about Jetspeed. If you want standards compliant software that works to support integration with other Apache based software, then Jetspeed is it. Jetspeed has support for most of the standards and supports LDAP quite nicely (it works well with ApacheDS). I did have some questions about how flexible the support for LDAP is if I have a schema that differs significantly from the default LDAP schema. It looks like there were custom objectclasses and attributes, which is something we were looking to stay away from.

In the end, Jetspeed 2 did not make our short list for a couple of reasons.

  • I found the existing workflow for creating portlets and pages to be a little more cumbersome than I had expected. I worked extensively with Jetspeed 1, and a little with Jetspeed 2 when work on it was just starting. The placement and workflow of the administrative functions were not intuitive to any of the developers that looked at it. I knew what to do based on previous experiences, but the other folks struggled a bit. I realize we could customize it to make it better, but I want to focus on working on the look and feel and our own apps, which is what our users are going to see. I don’t want to spend a ton of time working on the administrative interfaces beyond the limited identity-related work we will have to do in any portal.
  • The documentation is OK, but not great.

I love Apache, and have used many Apache products, but two areas where I personally feel many Apache products come up short are presentation / UI (the first point above) and documentation (second point). This isn’t true for all Apache products, and it has gotten somewhat better, but documentation and UI are just not major strengths of some of these folks. That is one reason why I liked it when Gluecode implemented a commercial version of Jetspeed – they cleaned it up a bit. I also had some concerns about the flexibility of the LDAP model, but I didn’t dig deep enough to be able to address those concerns.

Apache is great, Jetspeed is great, but Jetspeed 2 did not make our short list based on our high level criteria.

The Short List

So, what did make our short list and why?  Well, the short answer is that Liferay and the JBoss Portal (Community Edition) were our short list candidates.  Liferay had always been a contender, but JBoss Portal made the short list toward the very end of the evaluation.   Both of these products have good documentation, good support for standards, support for LDAP and security frameworks, and strong community support.

I will be providing some more detailed information on why they made the short list and what portal we eventually chose, but this post has gotten quite long, so I will be providing that information in a follow-up post in the next couple of days (probably after the new year).

December 31, 2008 | Development, Portals

Comparison of LDAP / Directory Servers – Update

Almost two months ago I wrote a post about some directory servers I was testing, mostly covering some early testing that I had done with OpenDS and OpenLDAP. Those test results showed OpenDS performing better than OpenLDAP in an out of the box testing scenario. I got some feedback from different folks, including Howard Chu, who has been involved with OpenLDAP. While I didn’t follow up directly with Howard on his tuning comments, I did do some tuning of both OpenLDAP and OpenDS. I don’t have all of the test results in a presentable format, but I do have some additional findings.

Improving Performance

Both of these directory servers come tuned for developer use out of the box, which is to say that they are not really tuned in any way at all. Instead they are configured to use as small a footprint as possible. This makes a lot of sense, since the developers have no idea how much memory or processing power you have and assume that the first time you use it you are trying it out in a development or test environment.

Once I spent some more time on the OpenDS and OpenLDAP sites and tweaking the configuration of each, I was able to show improved performance in each.  Given the nature of our implementation, only a couple of hundred records right now and a fairly low number of requests, the performance difference between the two was negligible.   It is possible that we might see some more significant difference with a larger number of requests and more entries.

You can find more tuning information for OpenLDAP at: http://www.openldap.org/faq/data/cache/190.html

More tuning information for OpenDS is here:  https://www.opends.org/wiki/page/HowToTunePerformance

The Verdict – Take 2

Given the results were so close, did that alter my preference for OpenDS?  Nope.  We have been very happy with the test results and features from OpenDS.   OpenDS also fits very well into our architecture and technology stack.  Personally I am very comfortable with the tools and documentation for OpenDS, and the OpenDS team continues to improve both.

Final Thoughts

OpenDS works very well for us and closely matches what we were looking for, both from a technology standpoint and a community standpoint. The OpenDS developers and community members are all very friendly and helpful. They continue to make improvements in the software and documentation.

Having said that, there may be reasons why you would choose one of the other directory servers, so while you may use my experience as a guide, make sure that you compare the features, technology stack, and architecture to your own requirements.

I would recommend evaluating not only OpenDS, but also OpenLDAP, ApacheDS, and others such as Red Hat / Fedora Directory Server. If you are in a Windows shop, any of the LDAP servers will work for you, but certainly Active Directory should be considered. I also have a high level of respect for Novell’s eDirectory. If you have a very large deployment, eDirectory might be something you really want to consider. Keep in mind that Active Directory and eDirectory are both LDAP-compliant servers that offer features beyond an LDAP server, and may in fact differ from the LDAP specification in some areas.

December 30, 2008 | LDAP, Uncategorized

Lotame is Looking for a MySQL DBA / Data Architect

We are currently looking for a MySQL DBA / Data Architect at Lotame. The job description is below and can also be found on the careers page of our web site, where you can also apply and upload your resume.

MySQL DBA/Data Architect

Lotame is looking for a data mastermind to build the next generation of internet marketing technology specifically geared towards social networking websites like MySpace and Facebook. If you like to be intimately involved in the design and development of a company’s core technology, want to own projects and make a real difference in a small company, then Lotame wants you.

Candidates for this position must be thoroughly knowledgeable in database technologies, with specific knowledge of MySQL in a Linux environment.

The day to day:

  • Manage MySQL on Linux in Production/QA/Development environments utilizing open source technologies including LVM and DRBD.
  • Support system administrators and developers to ensure optimal system design
  • Ensure database systems perform reliably
  • Lead the design of all aspects of the data architecture, including data models, data flows, aggregations, clustering, replication, and backup and recovery strategies
  • Develop and enforce enterprise wide database design standards
  • Support team in troubleshooting efforts

Qualifications:

  • BS in a technical field or significant relevant experience
  • 4+ years of database experience
  • Proficient in SQL
  • Experience in database performance tuning, backup and recovery, system architecture and design with MySQL
  • Strong understanding of system administration fundamentals in a Linux environment.
  • Ability to write scripts to support database administration activities.
  • Experience working with very large data sets
  • Experience with data sharding/partitioning is a major plus
  • Experience with MySQL stored procedures, functions, and triggers
  • Experience managing large, dynamic storage using LVM.
  • Knowledge of multiple MySQL storage engines
  • Self-motivated with strong problem solving and analytic skills
  • Ability to handle multiple tasks

December 1, 2008 | Database, Jobs, MySQL

Comparison of Directory / LDAP Servers

A couple of weeks ago I began the exercise of investigating Directory Servers again. This post provides some background on my previous experience, plus what I have found in my current investigation thus far.

The Past

I have some experience with Directory Servers from around 2001-2004, when I went through an effort very similar to what I am doing now, including a code evaluation of those that were open source. At that time it really came down to whether it was appropriate for the company to use either LDAPd (now Apache DS) or OpenLDAP. It came down to these two based primarily on the fact that they were more open or easier to install than some of the other contenders. We looked at Sun Directory Server, MS Active Directory, and Novell eDirectory, among others. At the time the company I worked for delivered a Windows based solution, but Active Directory just did not meet our needs, and the Sun Directory Server proved troublesome to install properly in our environment. The Novell offering, while strong, relatively easy to install, and affordable (free for the first 1MM nodes at the time – we weren’t going to have that many), wasn’t the politically acceptable option and didn’t quite match up to our particular needs.

That left us with OpenLDAP and LDAPd, in part because they both supported database backends (we had a particular reason for requiring that). Personally, I really liked LDAPd and offered a few code snippets to the project where I found some issues around the database backend modules. I very much liked the idea of a true directory server that supported many different types of backends based on need, which I believe is what led Apache DS to be the encompassing project it is now. The issue at that time was really more about maturity – was this project going to be able to support us moving forward? On the other hand, OpenLDAP had history, a large install base, and unfortunately, seemed to have developers that honestly couldn’t have cared less about supporting the Windows environment. On checking out the code, I found that it was impossible to build within Windows as it was. That meant working through C conditional compilation statements, an experience I did not much relish but nevertheless took on. After some time fooling around with the code I did get it to compile, and was able to get the backend components to work with Oracle and MS SQL Server as well.

While not thrilled about my Windows based experience with OpenLDAP, it was the winner.  It had the features we needed and it was solid and trusted.

The Situation Now

Now several years later I am back undertaking a similar effort at a different company and with a different set of requirements.  I am no longer constrained by the Windows environment as we run our entire environment in Linux.  I also don’t have the need (as far as I can tell right now) to support a database backend.  So, here we are again – OpenLDAP is still the same stalwart it has always been, Active Directory still doesn’t meet our needs (for different reasons), and I didn’t even bring up Novell this time (although I am still a fan).  The playing field has changed though – LDAPd is now ApacheDS, with a larger supporting cast, an increased scope, and a changed architecture.  The new kid on the block is OpenDS, which has come out of the Sun world, has had a 1.0 release this year, as well as some more recent 1.1 builds.  Given our platform and our requirements we decided to take a look at ApacheDS, OpenLDAP and OpenDS.  After some quick conversations with other folks out there, I decided to defer my test of ApacheDS and wait for some additional work to be done on the 1.5 branch to get some additional performance and “ready for prime-timeness” (I am happy to hear from the Apache DS folks if they believe I should reconsider).  So at the end of the day, I am off to install OpenDS and OpenLDAP ….

The Install

My test environment is CentOS, and I created two virtual servers on the same physical box with the exact same CPU and memory configurations.  The intent was to work with a more or less default install of each of the servers without performing any tuning, tweaking, or custom building.  CentOS always tends to be a few package versions back in their repositories, but I decided to use what was there for the sake of sticking with the rules above.  On the OpenDS side, I started with OpenDS 1.0, instead of the new builds in OpenDS 1.1.

From an install standpoint, OpenDS seemed quite a bit easier to me, even though I had never seen it before.  OpenLDAP was exactly how I remembered it.  I took the installs a bit further and set out to configure both to use TLS for security.  Again, OpenDS was much easier to set up; OpenLDAP needed a lot more configuring, and I ended up deferring it until later (we didn’t really need it for the test, or for our environmental set up in production either … for now.)

In terms of features, well again, LDAP is mostly LDAP.  However, one feature that I very much liked (and we need) from OpenDS was virtual attributes, specifically isMemberOf.  This is EXTREMELY useful to us, and the default install of OpenLDAP from the CentOS repos doesn’t have it.  To be fair, the latest available versions of OpenLDAP do have the memberOf feature, but I would have to build them myself.  Because we want to avoid building and maintaining 3rd party code whenever possible, this ended up being a ‘+’ in the OpenDS column and a ‘-’ in the OpenLDAP column.
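To make the appeal of isMemberOf a bit more concrete, here is a minimal Python sketch of the work a client has to do when the server does not supply that attribute: read every group and invert the member lists.  The DNs, the `groups` dictionary, and the `is_member_of` helper are all made up for illustration; this is not OpenDS or OpenLDAP code, and it compares DNs as plain strings, which a real client should not do.

```python
from collections import defaultdict

# Hypothetical group entries: group DN -> list of member DNs.
groups = {
    "cn=admins,ou=groups,dc=example,dc=com": [
        "uid=alice,ou=people,dc=example,dc=com",
    ],
    "cn=developers,ou=groups,dc=example,dc=com": [
        "uid=alice,ou=people,dc=example,dc=com",
        "uid=bob,ou=people,dc=example,dc=com",
    ],
}


def is_member_of(group_entries):
    """Invert group -> members into member -> sorted list of group DNs."""
    reverse = defaultdict(list)
    for group_dn, members in group_entries.items():
        for member_dn in members:
            reverse[member_dn].append(group_dn)
    return {dn: sorted(dns) for dn, dns in reverse.items()}


memberships = is_member_of(groups)
for user_dn, group_dns in sorted(memberships.items()):
    print(user_dn, "->", group_dns)
```

With a server-side virtual attribute, the same list comes back as part of the user's own entry, so the client never has to walk the groups at all.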

The verdict: I installed OpenDS using both the GUI installer and the command line interface, and used the command line interface for OpenLDAP.  All things considered, setting up one LDAP server feels much like setting up any other; it isn’t that hard.  In the end though, I just felt OpenDS was easier, and would certainly be more straightforward for someone without much background in the technology ... and it had the nifty isMemberOf attribute we wanted.

Testing by Hand

First I just wanted to make sure that the servers did what they should do.  I should be able to use any LDAP client to add, remove, search, and modify LDAP entries.  Everything worked fine using the command line tools, and mostly OK using the available GUI LDAP clients.  My client tool of choice is Apache Directory Studio.  For the purposes of this initial set of tests we decided to go the simple route – we won’t have thousands of objects in the LDAP store; maybe one day, but for now we are talking about hundreds.  I only created a handful of users, groups, and orgs for this test.  The only caveat in my testing with this client was that I had some very specific problems connecting to the OpenDS server using a startTLS connection from a Linux client.  I did not have the same issue with the Windows client, and I have been able to “clear up” the problem on some Linux clients, but I can’t explain why yet (see incident DIRSTUDIO-414 at the ApacheDS project).  I did not have this problem connecting to an ApacheDS server (I did set one up when I ran into this problem – very easy!), and I still haven’t set up a secure OpenLDAP server, so I can’t provide any feedback on that.

My next step was to write some sample code to exercise OpenLDAP and OpenDS a little.  Again, no problems with either, including a secure connection using TLS to the OpenDS server (which makes the problem above even more perplexing).  I put in a little data, updated it, etc.

The last thing I did was to do some data exports / imports from OpenDS to OpenDS, OpenLDAP to OpenLDAP, OpenLDAP to OpenDS, and OpenDS to OpenLDAP.  The only thing that was unexpected here was that OpenDS was placing UUID information into the LDIF, which OpenLDAP was NOT HAPPY with (as it shouldn’t be).  Removing the UUID from the LDIF took care of the issue.  After that, everything was great.
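For anyone hitting the same import problem, the clean-up can be scripted.  The sketch below is only an illustration and rests on an assumption: it takes the offending attribute to be named `entryUUID`, which the post itself does not state, so check your own export before relying on it.  The `strip_attribute` helper and the sample entry are invented for the example.

```python
def strip_attribute(ldif_text, attribute="entryUUID"):
    """Return ldif_text without lines for `attribute` or their folded continuations.

    The default attribute name is an assumption; pass whatever your export uses.
    """
    prefix = attribute.lower() + ":"
    kept = []
    skipping = False
    for line in ldif_text.splitlines():
        if skipping and line.startswith(" "):
            # LDIF folds long values onto following lines that begin with a space.
            continue
        skipping = line.lower().startswith(prefix)
        if not skipping:
            kept.append(line)
    return "\n".join(kept) + "\n"


# Made-up sample entry, not taken from either server.
sample = (
    "dn: uid=alice,ou=people,dc=example,dc=com\n"
    "objectClass: inetOrgPerson\n"
    "uid: alice\n"
    "cn: Alice Example\n"
    "sn: Example\n"
    "entryUUID: 11111111-2222-3333-4444-555555555555\n"
)

cleaned = strip_attribute(sample)
print(cleaned, end="")
```

Matching on the attribute name followed by a colon also catches the base64 form (`name:: value`), and attribute names are compared case-insensitively.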

Automated Testing

My initial instinct was to go for my good old trusted friend JMeter.  I did run a couple of tests through JMeter before I decided that it was a little more work than I wanted to take on, so I went looking for another tool.  That is when I ran across SLAMD.  I have been very happy with SLAMD thus far, even though I know I have only scratched the surface.  Keeping in mind that I don’t have a ton of data in the stores, I headed off to run one of the LDAP Load Generator projects.  I created two – one for OpenDS and one for OpenLDAP.  These tests ran continuously over an 8 hour period with 2 client machines running 40 threads apiece; the first 10 minutes and last 10 minutes of the 8 hours were used as warm up and cool down time.  Some graphs from those tests are included below.

Figure: SLAMD Comparison of Overall Operations Attempted

Figure: SLAMD Comparison of Overall Operation Time
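The shape of the run described above (8 hours, 2 clients with 40 threads each, 10 minutes of warm up and 10 of cool down) works out as follows.  This is only a back-of-the-envelope sketch in Python; the `measured_window` and `trim` helpers are hypothetical and have nothing to do with how SLAMD itself handles warm up and cool down.

```python
RUN_SECONDS = 8 * 60 * 60     # 8 hour run, from the post
WARM_UP_SECONDS = 10 * 60     # first 10 minutes are discarded
COOL_DOWN_SECONDS = 10 * 60   # last 10 minutes are discarded
CLIENTS = 2
THREADS_PER_CLIENT = 40


def measured_window(run_seconds, warm_up, cool_down):
    """Return (start, end) offsets in seconds of the interval that counts."""
    return warm_up, run_seconds - cool_down


def trim(samples, run_seconds, warm_up, cool_down):
    """Keep only the (offset_seconds, value) samples inside the measured window."""
    start, end = measured_window(run_seconds, warm_up, cool_down)
    return [(t, v) for t, v in samples if start <= t < end]


start, end = measured_window(RUN_SECONDS, WARM_UP_SECONDS, COOL_DOWN_SECONDS)
total_threads = CLIENTS * THREADS_PER_CLIENT
measured_minutes = (end - start) // 60
print("total threads:", total_threads)
print("measured minutes:", measured_minutes)
```

So each server was measured under 80 concurrent threads for 460 of the 480 minutes.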

The Verdict

While I have not spent a ton of time doing any analysis on these results, and certainly have not done any tweaking to affect them in any way, it would appear that OpenDS has better performance than OpenLDAP in our simple tests.  With a few exceptions, OpenLDAP’s performance seemed to vary a little more widely than OpenDS’s, which surprised me a little.  There were some highly variable points with the OpenDS server, but they are very high peaks spread out over what look like very consistent intervals.  I would tend to interpret these as garbage collection since OpenDS is Java based.  As I said, I have only begun to really investigate these results, so I welcome any other interpretations or experiences that others have, and I will continue to do some further investigation, run additional tests, and test out synchronization.
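One cheap way to check the “consistent intervals” impression is to pull out the spikes and look at the gaps between them.  The Python sketch below does that on made-up numbers (one sample per minute, a 5 ms baseline, a 60 ms spike every 30 minutes); none of it comes from the actual SLAMD results, and the helper names and the threshold of three times the median are arbitrary choices for illustration.

```python
from statistics import mean, median, pstdev


def spike_times(samples, factor=3.0):
    """Return timestamps whose value exceeds `factor` times the median value."""
    threshold = factor * median(value for _, value in samples)
    return [t for t, value in samples if value > threshold]


def spacing_summary(times):
    """Return (mean gap, standard deviation of gaps) between consecutive spikes.

    Needs at least two spikes; otherwise statistics raises StatisticsError.
    """
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return mean(gaps), pstdev(gaps)


# Fabricated illustration data: (minute, operation time in ms).
samples = [(minute, 60.0 if minute % 30 == 29 else 5.0) for minute in range(180)]

spikes = spike_times(samples)
average_gap, gap_spread = spacing_summary(spikes)
print("spikes at minutes:", spikes)
print("average gap:", average_gap, "spread:", gap_spread)
```

A spread that is small relative to the average gap suggests something periodic, but it does not say what; confirming garbage collection would still mean lining the spikes up against the JVM’s own GC logging.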

November 4, 2008 | LDAP | 9 Comments

Why Social DevCamp East is Important

And Why You Should Attend

Social DevCamp East is being held in Baltimore again – the fall session was recently announced for November 1st.  

What is Social DevCamp?  

“Social DevCamp East is the Unconference for Thought Leaders of the Future Social Web ….. Social DevCamp East Fall 2008 once again invites east coast developers and technology business leaders to come together for a thoughtful discussion of the ideas and technologies that will drive the future of the social web.” – Social DevCamp Site

So, why should you care?  There are several reasons why this is an important event, not only for developers and companies that are involved in social applications, but also for anyone interested in leading edge technologies and promoting the area ….

Social DevCamp Provides a Great Opportunity to Learn About Technology

Many people still think of social networking as a niche area, but that simply isn’t true.  If you have been watching the technology news lately or read Andy Monfried’s blog you may have heard or read that 40% of internet traffic is on social networks. 40%!  That’s impressive, but what does it mean?  It means that in order to support that type of growth, to handle that much traffic and that many eyeballs, the social media related companies need to continue to push the envelope, advancing technology in ways that we haven’t done before.  They need to determine how to store more data, access it quicker, relate it better, and make it user friendly and useful enough for people to keep coming back.  The amount of data that many of these companies have to deal with is simply staggering.  Social media related companies are innovators, learning how to push existing technology to the limits and develop new technologies and strategies where needed.

The point is that, whether you are a social media developer or not, there are important things that you can hear about at Social DevCamp and take back to your company to use there.  Topics will include things like Web 3.0, Cloud Computing, Crowdsourcing, Building Out the Semantic Web, Mobile Development Best Practices, and other topics similar in nature.

Social DevCamp Provides a Great Opportunity to Network

This event is targeted to the East Coast, not just Baltimore.  That provides an opportunity to meet innovators from up and down the coast.  The Greater Baltimore Technical Council will be there, as will people that have been involved with twittervision, the Mozilla Foundation, and others.  These are real people with real ideas, building and leveraging the latest technologies, who will be sharing those ideas in a setting designed for, and conducive to, discussion.  This is not a vendor show where people are selling technology to you, but rather a place where they are talking about technology and where and how it is being used.  What a great way to spend at least part of your day.

Social DevCamp Highlights the Great Opportunities and Companies on the East Coast

When people think of new technology, start-up companies, and people living on the bleeding edge of technology, they tend to think of the West Coast.  There are fantastic things going on out there, and there is quite a bit of innovation … but that doesn’t mean that there isn’t anything happening here in the East.  It is a mindset kind of thing  - looking for new technology, go west young man.

That isn’t true though; there are great things happening on both coasts.  The problem is that not everyone knows about the great opportunities and the great small / start-up companies that are here.

Social DevCamp East provides the opportunity to bring an awareness of some of the really mind-bending and fun technology that people use in this area.  People that are looking for exciting opportunities on the East Coast can look right here and see who may be using technologies in the areas of cloud computing, distributed caching, distributed processing, data portability, and areas such as mobile computing and crowdsourcing.   

As you can tell, I am really excited about this event.  I was unaware of it until recently, when it was brought to my attention by someone with whom I work (thanks Bev!).  In fact, I am excited enough about it that I have just sent an e-mail to the organizers stating that Lotame would like to be one of the sponsors.  We want the same chance to see and hear the great things going on in this area and to mingle with smart people with great ideas.

September 20, 2008 | Social Media, Uncategorized

Blogs and Wikis

I have done a moderate bit of blogging and wiki editing, although most of it has been behind the closed walls of employers.  I got to thinking about it though, when is the “right” time to use a blog or a wiki?  I have my own opinions based on that …

First let me say that I tend to think of both blogs and wikis as business and/or resource tools; I do not think of them as a way to communicate personal information – I do that other ways with my friends.  I may want the whole world to see what I think about a technical or political topic, but that is very different than letting you know where I live or who my family members are.  I know not everyone agrees with me in this area, but my stance on what is appropriate for public vs. private consumption is important background for my thoughts on when to use a blog vs. when to use a wiki.  Now that we have gotten that out of the way, let’s move on.

I think most people think that the decision is straightforward – use a blog when you are the sole owner of the material or thought, use a wiki when the content is shared.  Makes sense, but I don’t think it is the whole picture.  When I think of a blog I think of a way to communicate unrelated bits of temporal data.  Maybe news or a short burst of thoughts on a topic, or commentary.  That’s not to say there isn’t or shouldn’t be some connectedness between blog posts either within a blog or across blogs.  The connectedness is useful and good, but not the same connectedness of a wiki.  I see a wiki as trying to tell a story or provide direction that is not temporal.

But what does that mean, temporal vs. not temporal?  What I mean by that is that a blog is a one time shot – I am telling you something as it is or as I see it then and there at that point in time.  I don’t believe a wiki presents itself in the same way.  A wiki is made up of inter-related content that, while created over time, should be able to withstand the test of time (i.e. it should be updated to reflect current status, not subject to a point in time).  A wiki can be constantly edited and corrected, and should be.

But doesn’t that really take me back to the first point about who “owns” the content of the blog post vs the wiki entry?  Not really.  I can’t argue with that point, I believe it to be true.  But what if you are the sole owner of the content of both a blog and a wiki?   Why would you do that, you ask?  (i.e., why would you have a wiki only you could edit)?  Because of the type of content, style and connectedness of the content, and the expectation of the reader.

Let’s use an example.  Here I am blogging about wikis and blogs.  It is a short (ok, not so short) opinion that I have right now.  It certainly isn’t any statement of fact, it is an opinion … and my opinion could change over time.  Since I don’t want to rewrite history I might actually make another blog post over time that states I have changed my opinion.  This is just the kind of thing that I would expect to read in a blog, but would never, ever read in a wiki.  On the flip side, I also have a wiki (shameless self promotion, visit  http://peragro.wikidot.com).  I have two main things on that wiki about SOA and SOA Governance (I plan to put more there soon).  Those topics connect to each other, they have sections that would make them way too long for a blog, and as the state of development and SOA changes, so should the content in that wiki.  What I said in the past is no longer relevant, only what people need to know to develop a good architecture at the time they are reading it with the most up to date information.

I am not sure if this post will actually help anyone work out any issues they have in trying to decide on one over the other, and the fact of the matter is that the technology is converging anyway such that soon enough there won’t really be a difference between the two, just a difference in the types of content you find spread across a bliki (Blog + Wiki).  Maybe my real reason for doing this is to justify the various places I have squirreled my content away.  Anyway, you’ve read this far, go ahead and check out my wiki.

July 31, 2008 | Blogging and Blogs

iPhone and Blogging in WordPress

This post is coming to the site courtesy of an iPhone 3G ….

So, what do I think of the usability of posting via an iPhone? Overall the experience is fairly positive considering that typing on a small keyboard definitely slows things down. I won’t bore you with yet another review of an iPhone since there are already so many out there. What I will tell you is that I do recommend the WordPress application from the iTunes application library. It isn’t flashy, but it does do the job of letting you post your thoughts when you don’t have access via a larger device … Maybe while you are sitting in the airport or taking a long ride in the car (notice I said riding, not driving!)

My advice? Try the iPhone app, take it for a spin, but do your hard core blogging on something with a little more finger real estate.

I can’t resist saying at least one other thing about the iPhone… I do very much like the 3G, although we don’t actually get those speeds where I live. Everything has been great, except MS Exchange integration. One of the big reasons I waited to get the 3G was to be able to get mail pushed from our corporate servers, and so far no dice. I was able to connect to some other Exchange services outside of our company, but so far we have not been able to find the secret ingredient to getting our mail server to work.

That leads me to leave you with a final thought on the various “answers” that people have out there to try and make this work:

If you are guessing, just say you are guessing and don’t fake being an authority on the subject …. And if you really don’t know, just stay silent. All the garbage out there that is clearly not right just makes it take that much longer to find the real answers.

July 30, 2008 | Blogging and Blogs, iPhone

Agile vs. Agile

“Agile Development” has been a buzz term for quite a while, but people still manage to use it incorrectly.  When somebody comes to talk to you or your team and says “I think we need to be more agile”, it is always important to ask them what they really mean and make sure that they actually do mean “we” and not “you”.  Why does that matter?  What’s the difference?  Agile is good, right?

Well, not necessarily.  First off, let’s define what Agile Development is.  If you ask any six pundits, you would probably get six different answers, but in general I don’t think anyone would disagree with:

Agile development is an iterative and collaborative approach to software development that allows teams to respond to changing demands of the business.

I am intentionally leaving it simple and leaving out things like how much process is needed / not needed, and what types of deliverables are required to avoid arguments and stick to the basics.  Anyway, that makes sense, who could disagree with that?

Plenty of people.  First off, let’s just state that agile development isn’t the right fit for every situation and not worry about why it might or might not be (that’s a topic that has been covered many times already).  The issue here is discerning what someone means when they say “we need to be more agile”.

There have been many times when the “uninitiated” project manager or company leader makes this statement, and what they really mean is that “you need to be able to respond to my every whim”.  O….kay.  That is one definition of agile, I suppose.  This manager hears agile and thinks, gosh, I really could get everything I want exactly when I want it, agile is about the team being able to respond to my every request.  Don’t laugh, there are plenty of people that think this, and why not, it actually has been the state of development for a long time.

This interpretation of agile only provides for one set of changing behaviors – that of the development team.  Agile behavior goes way beyond the development team, it needs to create change in an all-inclusive team that is composed of not only the development team, but those very managers and other stakeholders such as QA, Operations, and customers (internal and external).  A real agile development process touches everyone.  Agile is about everyone sharing the pain and the gain all together.  Agile is not about a couple of developers responding to issues whenever they arise, it is about the entire group understanding what is important and each taking a stake in the success of the delivery of whatever the end results may be.  Agile is about creating behaviors that lead to trust across an organization through predictability.

When everyone works together and takes responsibility for prioritizing what needs to get done and continuously communicates not just progress, but all aspects of the work being done, the end result is more predictable and there are not any big surprises.  Predictability and trust are important – those are the cornerstones that help information flow throughout the collective group and provide the building blocks for everyone to not only appear successful, but actually be successful.

So, when someone comes to you to talk about agility, make sure you know what they mean, and if they don’t quite get the true meaning, gently give them a push in the right direction.

July 30, 2008 | Development | 3 Comments

Innovation

[ This particular post is just reused material from one of my other blogs at Blogspot that I thought was worth reposting here. ]

I was reading an article about innovation on ZDNET a number of months back. It was the standard argument about what innovation is … is it revolutionary or creation of something from scratch? The article presented two different points of view on the topic. (Unfortunately, I can’t find the link to the article now.)

I think that both of them made some interesting points, but that they failed to put together what I think is a holistic view of the situation. Innovation need not always be the creation of something from scratch and need not necessarily be revolutionary … Innovation can also be using existing parts in a new way. What would be truly innovative would be to create two entirely different applications or services using 90% of the same pieces; pieces that already existed from other projects in the company or as parts of open source efforts.

If you sit back and really analyze most applications you will find that a significant portion of the effort involved with creating an application is the same effort that goes into every other application – security, data source access, presentation, transformation, reporting, etc. While each application may use these services slightly differently, the general idea is primarily the same. Think of how efficient an organization could become (and agile) if it concentrated on the 15% to 25% that truly makes applications different from one another. That would truly be innovative.

July 30, 2008 | Development

   
