Saturday, December 4, 2010

automatic document classification with Alfresco Part 2

In the first part of this article, I explained how you can use Lucene to query a document (Word, PDF, etc.) and find matches for specific keywords. We needed this in order to automatically identify a document's category based on its content.

We've chosen a simple approach to demonstrate the automatic classification extension: if a document contains the name of a category, then it belongs to that category. Of course, other approaches are possible, such as assigning multiple keywords to a category. For example: if a document contains one of the words "java", ".Net", "c#", etc., then assign it to the category "Software development". This can easily be implemented once you have finished reading and understanding this article. How you implement it depends on your specific needs, and you might need a more advanced classification algorithm.
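To make the multiple-keywords idea concrete, here is a minimal sketch in plain Java. It is not the Alfresco extension itself: the class name `KeywordClassifier` and its methods are hypothetical, and it works on text that has already been extracted from the document, using a naive substring test instead of a Lucene query.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, for illustration only: maps categories to keywords
// and returns the first category whose keyword appears in the text.
public class KeywordClassifier {

    // LinkedHashMap keeps the categories in the order they were registered.
    private final Map<String, List<String>> keywordsByCategory = new LinkedHashMap<>();

    public void addCategory(String category, List<String> keywords) {
        keywordsByCategory.put(category, keywords);
    }

    // Case-insensitive substring match; empty if no keyword is found.
    public Optional<String> classify(String text) {
        String haystack = text.toLowerCase(Locale.ROOT);
        for (Map.Entry<String, List<String>> entry : keywordsByCategory.entrySet()) {
            for (String keyword : entry.getValue()) {
                if (haystack.contains(keyword.toLowerCase(Locale.ROOT))) {
                    return Optional.of(entry.getKey());
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        KeywordClassifier classifier = new KeywordClassifier();
        classifier.addCategory("Software development", Arrays.asList("java", ".Net", "c#"));
        System.out.println(classifier.classify("Meeting notes about the Java migration"));
    }
}
```

Keep in mind that a substring test is crude: "java" would also match "javascript". Matching on whole words, or letting the search index do the matching, would be a more robust choice.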

Tuesday, November 30, 2010

Alfresco automatic document classification : Part 1

Alfresco is capable of handling multiple classifications, or hierarchies of classification. It's a very useful feature that can make your life a lot easier when looking for documents, especially the ones with no indexed content like pictures, scanned documents, etc.
Classifying a document in Alfresco can be as easy as a few clicks in the browser. However, it can be a very time-consuming process if you are uploading many documents every day, or if you are migrating to Alfresco: imagine having to manually classify a few thousand documents!
If you are still classifying documents manually, analyzing their content, and sorting them into categories, you might be interested in finding out how you can extend Alfresco to automatically classify your documents for you.

Friday, November 26, 2010

Getting started with Alfresco

During my internship at TGR, one of the project's requirements was indexing and managing documents, and that was my first experience with Alfresco. Alfresco is an open source Enterprise Content Management (ECM) system that combines a collection of content-centric technologies like Document Management (DM) and Records Management (RM), along with other technologies that should make your life easier if your field of work is Content Management.

Customizing Alfresco can be a bit tricky and hard to grasp at first, but after doing it a few times you start to get the feel of it. This post, as well as the future articles about Alfresco, is here to help you get there.

Tuesday, November 23, 2010

iOS 4.2 released

Since yesterday at 18:00, iOS 4.2, the latest release of iOS, has been officially available for download.
New features:
  • AirPrint: print email, photos, web pages, and documents right from iPhone, iPad, or iPod touch.
  • AirPlay: stream digital media wirelessly from your iPhone, iPad, and iPod touch to your Apple TV and AirPlay-enabled speakers.
  • Find My iPhone: helps you locate your missing device and protect its data. It is now free on any iPhone 4, iPad, or fourth-generation iPod touch running iOS 4.2.

Friday, November 19, 2010

smart Card tutorial : Part 2

In my previous article, smart Card tutorial : Part 1, I explained how you can use the Cryptographic Token Interface Standard PKCS#11 to access your smart card or USB token and perform some simple operations, like listing the connected tokens and displaying each device's information (serial number, vendor, etc.).
This second part will cover some advanced usage of PKCS#11, such as:
  • Generating an on-card RSA key pair.
  • Storing certificates and objects.

smart Card tutorial : Part 1

When your application handles sensitive data or performs sensitive operations, security becomes a major concern. This is why some companies decide to use Strong Authentication for their applications, also known as 'two-factor authentication': one thing that you have (the smart card or USB token), and one thing that you know (the PIN or password).

This article is the start of a series of tutorials about smart cards (USB tokens are readerless smart cards, so the same applies): how to access them, manage them, generate keys, etc. By the end of the series, you should be able to write smart-card-based applications for authentication, digital signatures and PIN/PUK management.

Java JNDI Tutorial : LDAP query

Our topic of discussion today is JNDI (Java Naming and Directory Interface), which is a standard way to query directories for the different kinds of information they hold.
We will see how you can use Java to interact with OpenLDAP or any other directory (like Apache OpenDS, etc.).
In order to connect to an LDAP directory server, we will need to know:
  • The server URL (in our case localhost).
  • The authentication used: Simple, Digest, etc.
  • The base DN, or the root element.
  • The bind DN, the user we will connect to the directory as.
  • The bind password.
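
As a rough sketch of how these five parameters map onto JNDI, the snippet below builds the environment table that JNDI expects. The class name `LdapEnvironment`, the port, and the example DNs and password are assumptions made for illustration. The constants come from the standard `javax.naming.Context` interface, and `com.sun.jndi.ldap.LdapCtxFactory` is the LDAP provider bundled with the JDK.

```java
import java.util.Hashtable;
import javax.naming.Context;

// Hypothetical helper, for illustration only: collects the connection
// parameters into the environment table used to create a JNDI context.
public class LdapEnvironment {

    public static Hashtable<String, String> build(String serverUrl, String baseDn,
            String authentication, String bindDn, String bindPassword) {
        Hashtable<String, String> env = new Hashtable<>();
        // LDAP service provider shipped with the JDK.
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        // Server URL followed by the base DN, e.g. ldap://localhost:389/dc=example,dc=com
        env.put(Context.PROVIDER_URL, serverUrl + "/" + baseDn);
        // Authentication mechanism, e.g. "simple".
        env.put(Context.SECURITY_AUTHENTICATION, authentication);
        // Bind DN and bind password.
        env.put(Context.SECURITY_PRINCIPAL, bindDn);
        env.put(Context.SECURITY_CREDENTIALS, bindPassword);
        return env;
    }

    public static void main(String[] args) {
        // Example values only; replace them with your own directory settings.
        Hashtable<String, String> env = build("ldap://localhost:389", "dc=example,dc=com",
                "simple", "cn=admin,dc=example,dc=com", "secret");
        System.out.println(env.get(Context.PROVIDER_URL));
    }
}
```

Passing this table to `new javax.naming.directory.InitialDirContext(env)` is what actually opens the connection; the sketch stops before that step so it can run without a directory server.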

EJBCA : a Step By Step install Guide

As mentioned in my previous article, Presenting EJBCA, this article will explain how to get an in-house Certificate Authority up and running using EJBCA.
EJBCA needs a few components to be installed in order to work. We will be using:
  • EJBCA 3.9.5
  • MySQL Server 5
  • Apache Ant 1.7
  • OpenJDK 6 JDK
  • JCE (Java Cryptography Extension) 6
  • JBoss Application Server 4.4
  • MySQL Java Connector 5

    Presenting EJBCA

    Your enterprise recently decided to start using digital certificates for authentication, digital signatures, server authentication, etc. In other words, you need a Certificate Authority (CA) in order to deliver certificates for your servers and users.

    Of course, you can choose to buy certificates from known vendors like VeriSign, Thawte, etc., but you will need to pay for every certificate a user or a server needs! A better approach would be hosting your own in-house Certificate Authority; this way you can issue as many certificates as you need.
    Now that you have decided to host your own CA, have your own Public Key Infrastructure (PKI),

    A new Software Engineering Blog

    Hi and welcome everyone,

    My name is Haltout Sohaib, and I am a Moroccan Software Engineer. Software building, engineering and security auditing have always been my passion. Although I am a student right now, I have worked as a freelancer a couple of times, especially on security-related jobs.

    My friends, and people I have worked with or helped, have always encouraged me to start a blog. For various reasons I never really had the time, but I have finally decided to start one, because I believe that if we are where we stand today, it's all thanks to sharing knowledge. It's all about sharing!

    So, starting today, I will write and share tricks, best practices, tutorials, guides, etc. on various topics like:
    • Security.
    • Software Engineering.
    • Public Key Infrastructure.
    • Enterprise Content Management (ECM), especially Alfresco.
    • J2EE, JEE5.
    • Spring Security, MVC, etc.
    And other frameworks and technologies I have worked with.

    If you need help with a project you're working on, or have a question, feel free to contact me at: haltout.sohaib@gmail.com.

    I'm currently transferring posts from my old blog hsohaib.com to this one.