As part of the daily update, the 10:50 am slot changed: a new session, TS-5027 on the Minion Search Engine, was added. Curious about how it handles business data and searching while content is being streamed, I decided to attend. The presentation went into a good level of detail on how to use the APIs defined by the Minion engine, how the statistics are created, and how results surface depending on matches. The search engine can scan music files based on their tags. Unfortunately, it requires the document or content to exist in the file system.
My next session was TS 6237, Debugging Data Races. The topic interested me because we keep running into exceptions that occur intermittently, or on one system and not on another. The session showed in detail how race conditions can occur, especially on multi-core processors. Some notes are captured below.
With the same curiosity, and many questions still unanswered, I went to my next session, TS 6316, Transactional Memory in Java Tech based systems. This was even more interesting, since Intel and its partners are looking to implement transactionality at the core level using certain constructs. More notes are in the corresponding section below.
My next visit was the AMD general session. It was not really exciting, apart from the claim that AMD is working closely with Sun on improving Java.
To close the day I decided to go to the GlassFish V3 and JMaki lab. Luckily there were slots available for me to do the lab. I attended to get a feel for both technologies, GlassFish and JMaki. The exercise details, including instructions, are available online. Please refer to the link below.
--- TS 5579 Closures Cookbook -----------------------------------------------
Closures cookbook: Neal Gafter from Google
BGGA Review
Runnable r = {=> doWhatever(); }; // the closure body is executed when run() is called below
r.run();
Callable
when params are involved
Comparator
param and statement involved
Using closures in Swing
ItemSelectable is = ...
is.addItemListener(
{ItemEvent e => doSomething(e, is); }); // allows you to use local variables
- Creates an object that represents - code of body, lexical context
- An instance of some interface
- Few restrictions: may access locals and this, and may return from the enclosing method
Aggregate Operations:-
- Input is an agg: array, list, map etc.
- Bulk computation over data
- Part of computation caller-defined
- Can often be "automatically" parallelized
- Sort, filter, map, reduce, fold etc.
// sequential code: what is the highest GPA among the students graduating?
double highestGpa = -Double.MAX_VALUE;
for (Student s : students) {
    if (s.graduationYear == THIS_YEAR) {
        double gpa = s.getGpa();
        if (highestGpa < gpa) highestGpa = gpa;
    }
}
You have to read the whole thing to understand it.
//aggregate code
final double highestGpa =
students
.filter({Student s => s.graduationYear == THIS_YEAR})
.map({ Student s => s.getGpa() })
.max();
Aggregate code like this operates on the collection as a whole instead of walking a List by hand. The code can then be executed concurrently, especially on multi-processor machines. It enables management of concurrency in library code instead of leaving it to the JVM.
JSR 166y.
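For comparison, the same aggregate query can be sketched with the lambdas and streams that later shipped in Java 8, rather than in BGGA syntax. The Student class and the sample data are assumptions made up for illustration, not something shown in the session.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalDouble;

class GpaExample {
    // Hypothetical stand-in for the Student type used in the notes.
    static final class Student {
        final int graduationYear;
        final double gpa;

        Student(int graduationYear, double gpa) {
            this.graduationYear = graduationYear;
            this.gpa = gpa;
        }

        double getGpa() {
            return gpa;
        }
    }

    // filter -> map -> max, mirroring the aggregate pipeline in the notes.
    // The result is empty when no student graduates in the given year.
    static OptionalDouble highestGpa(List<Student> students, int year) {
        return students.stream()
                .filter(s -> s.graduationYear == year)
                .mapToDouble(Student::getGpa)
                .max();
    }

    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
                new Student(2008, 3.4),
                new Student(2008, 3.9),
                new Student(2009, 4.0));
        System.out.println(highestGpa(students, 2008));
    }
}
```

Switching `stream()` to `parallelStream()` is the point where the library, not the caller, would take over the concurrency.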
An API
/* record performance statistics of an operation */
public void recordTiming(
    String opName,
    long nanosec,
    boolean succeeded) {...}
client code:
long startTime = System.nanoTime();
boolean success = false;
try {
    // a series of statements to be timed
    success = true;
} finally {
    recordTiming(
        "opName", System.nanoTime() - startTime, success);
}
Problem: this pattern appears 100s of times and clutters the codebase.
One way to overcome this is aspect-oriented programming (AOP). Many use cases of AOP and closures overlap.
Move boilerplate into API?
class Tracer {
    private final long startTime = System.nanoTime();
    private final String opName;
    private boolean success = false;
    public Tracer(String opName) {
        this.opName = opName;
    }
    public void success() {
        success = true;
    }
    public void done() {
        record...
    }
}
client:
Tracer tracer = new Tracer("opName");
try {
tracer.success();
} finally {
tracer.done();
}
A little better, but not a success. Major drawback: what happens if the method needs to return from inside the timed code?
try {
if () {
tracker.success();
return result;
}
tracker.success();
return result1;
}
There is scope for forgetting one of the calls, resulting in wrong output.
- In Java 7, there is an option to catch multiple exceptions in one catch block
catch (MyException | MyException2 ex) {
tracker.failure();
} finally {
tracker.done();
}
In Java 7, an exception caught as a final parameter can be rethrown after doing some logic
} catch (final Throwable ex) {
tracker.failure();
throw ex; // rethrows the same exception types
}
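Both constructs mentioned above, multi-catch and the rethrow of a final exception parameter, can be sketched in plain Java 7 syntax. The `risky` method and the `failures` counter are hypothetical stand-ins for the timed code and the tracker.

```java
import java.io.IOException;

class RethrowExample {
    // Hypothetical failure counter standing in for tracker.failure().
    static int failures = 0;

    // Throws one of two unrelated checked exceptions, depending on mode.
    static void risky(int mode) throws IOException, InterruptedException {
        if (mode == 1) throw new IOException("io");
        if (mode == 2) throw new InterruptedException("interrupted");
    }

    // Multi-catch: a single handler for two exception types.
    static String multiCatch(int mode) {
        try {
            risky(mode);
            return "ok";
        } catch (IOException | InterruptedException ex) {
            failures++;
            return ex.getClass().getSimpleName();
        }
    }

    // Precise rethrow: ex is declared as Throwable, but because it is final
    // the compiler knows only the checked exceptions of the try block
    // (plus unchecked ones) can escape.
    static void rethrow(int mode) throws IOException, InterruptedException {
        try {
            risky(mode);
        } catch (final Throwable ex) {
            failures++;
            throw ex;
        }
    }
}
```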
How to solve above problem using closures:
Closure-based API
interface Block {
    void execute();
}
public void time(String opName, Block block) {
    long startTime = System.nanoTime();
    boolean success = false;
    try {
        block.execute();
        success = true;
    } finally {
        recordTiming(...);
    }
}
client:
public void time(String opName, Block block)...
time("opName", {=>
//a series of stmts to be timed
});
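The same closure-based API can be sketched with a Java 8 lambda in place of the BGGA `{=> ...}` block. This is only an approximation of the session's code: `recordTiming` is a hypothetical stub that appends to an in-memory log so the outcome can be inspected.

```java
import java.util.ArrayList;
import java.util.List;

class TimingExample {
    // Single-method interface, so a lambda can be passed where a Block is expected.
    interface Block {
        void execute();
    }

    // Hypothetical sink standing in for the real recordTiming.
    static final List<String> log = new ArrayList<>();

    static void recordTiming(String opName, long nanosec, boolean succeeded) {
        log.add(opName + ":" + succeeded);
    }

    // The boilerplate lives here once; callers only supply the block.
    static void time(String opName, Block block) {
        long startTime = System.nanoTime();
        boolean success = false;
        try {
            block.execute();
            success = true;
        } finally {
            recordTiming(opName, System.nanoTime() - startTime, success);
        }
    }

    public static void main(String[] args) {
        time("opName", () -> {
            // a series of statements to be timed
        });
        System.out.println(log);
    }
}
```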
Exceptions:
When the timed code throws exceptions: the Block interface above does not declare any exception
How to
interface Block<X extends Exception> {
    void execute() throws X;
}
public <X extends Exception> void time(
    String opName, Block<X> block) throws X ...
We would like to allow any number of exceptions
Exception Transparency:
interface Block<throws X> {
    void execute() throws X;
}
public <throws X> void time(
    String opName, Block<X> block) throws X {
    long startTime..
    boolean success = false;
    try {
        block.execute();
        success = true;
        ....
Completion Transparency:
To time the body of a method:
int f() {
return ..compute result..;
}
int f() {
time ("opName
}
Note that success = true may not be executed above.
Refactor the above as:
boolean success = true;
try {
    block.execute();
} catch (final Throwable ex) {
    success = false;
    throw ex;
} finally {...}
Completion
int f() {
    time("op", {=>
        return ....compute...; // return result from f
    });
}
A block that cannot complete normally
interface NothingBlock<throws X> {
    Nothing execute() throws X;
}
public <throws X> Nothing time(
    String opName, NothingBlock<X> block) throws X {
    long st = System...
    boolean success = true;
    try {
        return block.execute();
    } catch (final Throwable ex) {
        success = false;
        throw ex;
    } finally {
        recordTiming(....)
    }
}
If the op completes successfully use time(...), else time2(...)
A block that can complete normally
interface VoidBlock<throws X> {
    Void execute() throws X;
}
public <throws X> Void time(
    String op, VoidBlock<X> block) throws X {
    ...
    try {
        return block.execute();
    }
    .....
}
Combine using generics:
interface Block<R, throws X> {
    R execute() throws X;
}
public <R, throws X> R time(
    String op, Block<R, X> block) throws X {
    ....
client code:
int f() throws MyException {
time("op", {=> //executed on success
//
});
time("op", {=> // on failure
//
});
}
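A rough equivalent of the combined generic version in standard Java, with ordinary type parameters `R` and `X extends Exception` in place of BGGA's `throws X`. One difference worth noting: a Java lambda cannot return from the enclosing method, so the result travels back through `R` instead. `recordTiming`, `load`, `f` and `g` are hypothetical names used only for this sketch.

```java
import java.io.IOException;

class GenericTimingExample {
    // One interface covers any result type and any checked exception type.
    interface Block<R, X extends Exception> {
        R execute() throws X;
    }

    // Hypothetical stub: remembers only the outcome of the last timed operation.
    static String lastRecord = "";

    static void recordTiming(String opName, long nanosec, boolean succeeded) {
        lastRecord = opName + ":" + succeeded;
    }

    static <R, X extends Exception> R time(String opName, Block<R, X> block) throws X {
        long startTime = System.nanoTime();
        boolean success = false;
        try {
            R result = block.execute();
            success = true;   // reached only if the block completed normally
            return result;
        } finally {
            recordTiming(opName, System.nanoTime() - startTime, success);
        }
    }

    // Hypothetical operation that can fail with a checked exception.
    static String load(boolean fail) throws IOException {
        if (fail) throw new IOException("load failed");
        return "data";
    }

    // No checked exception in the block, so f() needs no throws clause.
    static int f() {
        return time("compute", () -> 6 * 7);
    }

    // The block throws IOException, so X is inferred as IOException.
    static String g(boolean fail) throws IOException {
        return time("load", () -> load(fail));
    }
}
```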
Simplified Syntax:
Some boilerplate in the code below, such as =>,
time("op", {=>
...;
});
can be avoided with the syntax below
time ("opName") {
....;
}
- Closing a stream is boilerplate code when working with streams
withStream(InputStream s : makeInputStream()) {
//do
}
interface CloseableBlock<R, C, throws X> {
R invoke(C c) throws X;
}
R withStream(C c, CloseableBlock<R, C, X> block) throws X, IOException {
try {
return block.invoke(c);
} finally {
c.close();
}
}
The block above can be replaced with the one below, where function types are used instead of wildcards
R withStream(C c, {C ==> R throws X} block) throws X, IOException {
.....
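A minimal sketch of the same helper in standard Java, where the try/finally with close() becomes a Java 7 try-with-resources statement. TrackedStream and readFirstByte are hypothetical names added only so the closing behaviour can be observed.

```java
import java.io.ByteArrayInputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class WithStreamExample {
    interface CloseableBlock<C, R, X extends Exception> {
        R invoke(C c) throws X;
    }

    // Runs the block and always closes the resource afterwards.
    static <C extends Closeable, R, X extends Exception> R withStream(
            C c, CloseableBlock<C, R, X> block) throws X, IOException {
        try (C resource = c) {
            return block.invoke(resource);
        }
    }

    // Hypothetical stream that remembers whether close() was called.
    static final class TrackedStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackedStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    // Client code: the caller never writes the close() boilerplate.
    static int readFirstByte(InputStream stream) {
        try {
            return withStream(stream, (InputStream in) -> in.read());
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }
}
```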
- Loop over entries in a map
With BGGA closures, a library method can do the looping
Map<K, V> map = ...;
for eachEntry(Key k, Value v : map) {
//do
}
void for eachEntry(
Map<K, V> map,
{K, V ==> void throws X } block) throws X {
for (Map.Entry<K, V> entry : map.entrySet()) {
loop through
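The looping helper can be approximated in standard Java with a two-argument callback interface. EntryBlock and sumValues are names assumed for this sketch; note that without BGGA's control syntax, the call site is an ordinary method call taking a lambda.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ForEachEntryExample {
    // Two-argument block that may throw a caller-chosen exception type.
    interface EntryBlock<K, V, X extends Exception> {
        void apply(K key, V value) throws X;
    }

    // The loop over Map.Entry objects is written once, here.
    static <K, V, X extends Exception> void forEachEntry(
            Map<K, V> map, EntryBlock<K, V, X> block) throws X {
        for (Map.Entry<K, V> entry : map.entrySet()) {
            block.apply(entry.getKey(), entry.getValue());
        }
    }

    // Client code: only keys and values are visible, never Map.Entry.
    static int sumValues(Map<String, Integer> map) {
        int[] total = { 0 };   // array so the lambda can update it
        forEachEntry(map, (String k, Integer v) -> { total[0] += v; });
        return total[0];
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("a", 1);
        counts.put("b", 2);
        System.out.println(sumValues(counts));
    }
}
```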
- In concurrency packages
Lock lock = ...;
withLock (lock) {
//do
}
R withLock (Lock lock, { ==> R throws X} block) throws X
......
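A sketch of withLock in standard Java on top of java.util.concurrent.locks. LockedBlock and heldInsideBlock are hypothetical names. The lock/try/finally/unlock pattern is the part that would otherwise be repeated at every call site.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class WithLockExample {
    interface LockedBlock<R, X extends Exception> {
        R run() throws X;
    }

    // Acquires the lock, runs the block, and always releases the lock.
    static <R, X extends Exception> R withLock(Lock lock, LockedBlock<R, X> block) throws X {
        lock.lock();
        try {
            return block.run();
        } finally {
            lock.unlock();
        }
    }

    // Client code: reports whether the lock is held while the block runs.
    static boolean heldInsideBlock(ReentrantLock lock) {
        return withLock(lock, () -> lock.isHeldByCurrentThread());
    }
}
```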
-----------------------------------------------------------------------------
--- TS-5027 Minion Search Engine --------------------------------------------
Minion Search Engine
get searchEngine by specifying an index dir where index is to be stored.
define fields using config file or API
index documents
public void index(SimpleIndexer si, MimeMessage m, String foldername) {
si.setID(m.getMessageID());
si.setField(m.getFrom());
...
}
Querying:
Use ResultSet.
Query operators
- Term: case, exact, morph, stem, wildcard
- Proximity: near, within, phrase, passage
- Parametric: =, <
- Boolean
Available Grammars:
- Full: contains, note, and
- web
- Lucenesque style
Doc vectors:
Provide weights based on the number of times a word is found in the doc.
Finding similar docs
- can get Doc vector and find similar docs
- search based on a field eg: subject
- search based on weighted fields
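To make the idea of a document vector concrete, here is a small sketch that builds raw term-count vectors and compares two documents by cosine similarity. This is a generic textbook technique written for illustration. It does not use Minion's API, and Minion's actual weighting is not something these notes capture.

```java
import java.util.HashMap;
import java.util.Map;

class DocVectorExample {
    // Builds a term -> count vector from lower-cased, non-word-split text.
    static Map<String, Integer> termCounts(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String term : text.toLowerCase().split("\\W+")) {
            if (!term.isEmpty()) {
                counts.merge(term, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Cosine similarity of two count vectors; 0.0 if either vector is empty.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (Map.Entry<String, Integer> entry : a.entrySet()) {
            int weight = entry.getValue();
            normA += (double) weight * weight;
            Integer other = b.get(entry.getKey());
            if (other != null) {
                dot += (double) weight * other;
            }
        }
        for (int weight : b.values()) {
            normB += (double) weight * weight;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

Documents that share many heavily weighted terms score close to 1, and documents with no terms in common score 0.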
Document trainer:
Given a set of docs with a label, you can train the engine to apply the label to new docs and classify them.
Clustering docs:
-
Tag based documents:
http://minion.dev.java.net
TS-5841
blogs.sun.com/searchguy
The one major thing I was looking for, searching over a stream, is not supported.
- How to apply it to a batch of documents? How to identify the separator between each doc?
- Only on clear-text data?
- Can I train the engine on docs that contain a keyword at a particular index?
-----------------------------------------------------------------------------
--- TS 6237 Debugging Data races -----------------------------------------------
The notes here are not comprehensive, since the session contained very fine details that I could not capture.
Debugging races:
- HashMap with one writer and multiple readers: a reader could see an insert that is only 3/4 complete.
- Similarly List
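The HashMap case is hard to reproduce on demand, so here is a simpler data race of the same family, sketched as an assumption-laden illustration rather than anything shown in the session: several threads increment a plain int field and an AtomicInteger side by side. The plain field may lose updates on a multi-core machine, and the atomic counter does not.

```java
import java.util.concurrent.atomic.AtomicInteger;

class RaceExample {
    static int unsafeCount;                                   // unsynchronized shared field
    static final AtomicInteger safeCount = new AtomicInteger();

    // Returns {unsafeCount, safeCount} after all worker threads have finished.
    static int[] run(int threads, int perThread) {
        unsafeCount = 0;
        safeCount.set(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    unsafeCount++;                 // read-modify-write, not atomic
                    safeCount.incrementAndGet();   // atomic increment
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(ex);
            }
        }
        return new int[] { unsafeCount, safeCount.get() };
    }

    public static void main(String[] args) {
        int[] result = run(4, 100_000);
        System.out.println("unsafe=" + result[0] + " safe=" + result[1]);
    }
}
```

The unsafe total varies from run to run and from machine to machine, which matches the "occurs on one system and not on another" symptom.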
Debugging:
- Visual Inspection: But user may not know the complete flow etc..
- Printing:
- make noise at each read/write of shared variable
- inspect trace after crash
- Per-thread printing of System.nanoTime(): but a heavyweight technique.
- FindBugs
- scales to production use
- pattern matching
-----------------------------------------------------------------------------
--- TS 6316 Transactional memory in Java Tech based systems -----------------
Java Transactional Memory: Suresh Srinivas, Vyacheslav Shakin from Intel
- TM is a tool for concurrency control in Multi-core systems
- STM is a software technology for developing new parallel software.
- HTM is a hardware technology that both existing and new software can utilize to improve performance as well as concurrency.
-
static void move (Queue q1, Queue q2) {
synchronized(q1) {
synchronized(q2) {
Object o = q1.dequeue();
q2.enqueue(o);
}
}
}
client code:
cls.move(q1, q2)
cls.move(q2, q1)
A possible problem with the above: Thread 1 acquires the lock on 'q1' while Thread 2 acquires the lock on 'q2', and a deadlock occurs.
To overcome this, the body can be declared in an 'atomic' block. When one thread's block commits, and another thread tries to commit any of its conflicting changes in between, an exception is thrown. Basically a transaction implementation at the lowest level.
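Standard Java has no 'atomic' block, so the session's approach cannot be shown in runnable form here. For contrast, the conventional lock-based workaround (not what the speakers proposed) is to acquire the two locks in one global order, sketched below with System.identityHashCode as the ordering key and a tie-breaker lock for the rare case of equal hashes.

```java
import java.util.ArrayDeque;
import java.util.Queue;

class MoveExample {
    // Used only when both queues have the same identity hash code.
    private static final Object TIE_LOCK = new Object();

    // Must be called with both queue locks held.
    private static <T> void transfer(Queue<T> from, Queue<T> to) {
        T item = from.poll();
        if (item != null) {
            to.add(item);
        }
    }

    // Always locks the queue with the smaller identity hash first, so
    // move(q1, q2) and move(q2, q1) take the locks in the same order.
    static <T> void move(Queue<T> from, Queue<T> to) {
        int fromHash = System.identityHashCode(from);
        int toHash = System.identityHashCode(to);
        if (fromHash < toHash) {
            synchronized (from) {
                synchronized (to) {
                    transfer(from, to);
                }
            }
        } else if (fromHash > toHash) {
            synchronized (to) {
                synchronized (from) {
                    transfer(from, to);
                }
            }
        } else {
            synchronized (TIE_LOCK) {
                synchronized (from) {
                    synchronized (to) {
                        transfer(from, to);
                    }
                }
            }
        }
    }

    public static void main(String[] args) {
        Queue<String> q1 = new ArrayDeque<>();
        Queue<String> q2 = new ArrayDeque<>();
        q1.add("a");
        move(q1, q2);
        System.out.println(q1 + " " + q2);
    }
}
```

The cost of this workaround is that every caller has to follow the ordering discipline, which is the kind of burden transactional memory is meant to remove.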
Implementations:
Atomic
McRT (Multi Core runtime)
- Java locks are optimized for:
- the no-contention case
- not optimized for the case of lock contention with no real data conflicts.
Hardware Transaction Memory:
- Java system utilizes HW transactional memory ("isolation" & "atomicity") with no changes to software
- Software lock elision
- Speculative execution
PPoPP 2008 paper
ISCA 2007 paper - on speculative elision.
References:
whatif.intel.com
java HTM is work in progress
stamp.stanford.edu
http://cs.rochester.edu/
-----------------------------------------------------------------------------
--- LAB 4520LT GlassFish V3 and building JSF using JMaki --------------------
Check out the link at http://developers.sun.com/learning/javaoneonline/j1lab.jsp?lab=LAB-4520LT&yr=2008&track=1 for instructions on doing this lab.
-----------------------------------------------------------------------------