Sunday, September 29, 2013

fun with solr - day1: installation and simple example

I had the urge to learn something new.
According to dice.com, Solr skills are in high demand. Solr is a high-performance NoSQL search platform/server built on Lucene Core, with XML/HTTP and JSON/Python/Ruby APIs, hit highlighting, faceted search, caching, replication, and a web admin interface.

I watched Yonik Seeley's presentation "Solr 4: The NoSQL Database" on YouTube and was fascinated by it.
The list of sites already using Solr is pretty impressive too, and I saw serious potential for our new app conversion.

So I decided to learn Solr as my weekend project!

Before you have fun with Solr, make sure Java is installed.

To check, run: $ java -version
On my MacBook, it was already installed:
$ java -version
java version "1.6.0_51"
Java(TM) SE Runtime Environment (build 1.6.0_51-b11-457-11M4509)
Java HotSpot(TM) 64-Bit Server VM (build 20.51-b01-457, mixed mode)

It was not installed on my EC2 instance, so I installed it with apt-get:
$ sudo apt-get install openjdk-7-jre-headless

Install & Run Solr with Jetty

What is Jetty? Jetty is a pure Java HTTP server and Java Servlet container.
For playing around, a single Solr instance running on the small Jetty that is already included in the Solr package is good enough. So let's get started.

# download the binary and unpack it
$ curl -O http://apache.mirrors.hoobly.com/lucene/solr/4.4.0/solr-4.4.0.tgz
$ tar -xvf solr-4.4.0.tgz
# clean up
$ rm solr-4.4.0.tgz
$ cd solr-4.4.0/example
$ java -jar start.jar

Then point your browser to: http://localhost:8983/solr/
Wow, that was easy, and what an amazing interface!




If you are also setting up on EC2, check the following so that your server can serve Solr (running on Jetty) via its public IP:
(1) go to Elastic IPs and associate an address with your instance
(2) go to Security Groups > Inbound and make sure port 8983 is open
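
To confirm from your own machine that the port is actually reachable, a small TCP check can help. This is only a sketch: the function name is made up for illustration, and it only tells you that something accepts connections on the port, not that Solr itself is healthy.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Calling is_port_open("your-elastic-ip", 8983) (the host here is a placeholder) should return True once the security group rule is in place and Solr is running.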



Add and Retrieve document (example from Yonik's presentation)

naoko-macbook:~ nreeves$ curl http://localhost:8983/solr/update -H 'Content-type:application/json' -d'
 [
 {"id": "book1",
  "title": "American Gods",
  "author": "Neil Gaiman"
 }
 ]
 '
{"responseHeader":{"status":0,"QTime":64}}

naoko-macbook:~ nreeves$ curl http://localhost:8983/solr/get?id=book1
{
  "doc":
  {
    "id":"book1",
    "title":["American Gods"],
    "author":"Neil Gaiman",
    "_version_":1447566597403181056}}

How cool.
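
The same two calls can be made from Python with only the standard library. This is a rough sketch, assuming the example server is still running on localhost:8983. The helper names are my own, not part of Solr, and the document carries only an id and a title to keep it short.

```python
import json
import urllib.parse
import urllib.request

SOLR_BASE = "http://localhost:8983/solr"  # assumption: the example server from this post


def build_update_request(docs, base=SOLR_BASE):
    """Build the JSON POST that the first curl command sends to /update."""
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        base + "/update",
        data=body,  # passing data makes urllib send a POST
        headers={"Content-Type": "application/json"},
    )


def build_get_url(doc_id, base=SOLR_BASE):
    """Build the /get URL that the second curl command requests."""
    return base + "/get?" + urllib.parse.urlencode({"id": doc_id})


def fetch_json(request_or_url):
    """Send the request and decode the JSON reply (needs a running Solr)."""
    with urllib.request.urlopen(request_or_url) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With Solr running, fetch_json(build_update_request([{"id": "book1", "title": "American Gods"}])) followed by fetch_json(build_get_url("book1")) mirrors the curl session above.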



Saturday, September 28, 2013

python: advanced sort with sorted and lambda

I was working on a Django project and needed to order a QuerySet in a somewhat complex way.
Basically, I am listing family members, but with the "spouse" relationship at the top.

Say I have a data structure like this:

class Family:
    def __init__(self, type, name):
        self.type = type
        self.name = name

    def __repr__(self):
        return repr((self.type, self.name))

fam = [
    Family("child", "Joi"),
    Family("child", "Leona"),
    Family("parent", "Toshiko"),
    Family("spouse", "Coy"),
    Family("parent", "Akira"),
]


In a SQL statement (PostgreSQL), I can express it like this:

SELECT * FROM family WHERE ......

ORDER BY CASE WHEN type='spouse' THEN 'aaa' ELSE type END

Easy enough.
Now, I wasn't sure how to do this in Django.
With raw SQL or extra(), the code becomes a PostgreSQL-specific solution, so I wanted to avoid that if possible.
So I decided to go with the pure Python function "sorted", since the set of data I will be dealing with is small (just family members, so usually 4 or 5 and at most 10 or so).

Here is how I was able to accomplish it.

>>> print(fam)
[('child', 'Joi'), ('child', 'Leona'), ('parent', 'Toshiko'), ('spouse', 'Coy'), ('parent', 'Akira')]

>>> family = sorted(fam, key=lambda x: ("aaa" if x.type == "spouse" else x.type)) 
>>> print(family)
[('spouse', 'Coy'), ('child', 'Joi'), ('child', 'Leona'), ('parent', 'Toshiko'), ('parent', 'Akira')]


I probably don't need to explain what lambda or sorted with the key param is, as the documentation and thousands of other posts do a great job of that, but I wanted to post this because I could not find an example that shows this kind of "key" for the sorted function.
The "key" function returns 'aaa' for a spouse and the type itself for any other relationship.

key = lambda x: ("aaa" if x.type == "spouse" else x.type)
for x in family:
    print("%(name)s has type of %(type)s" % {"name": x.name, "type": key(x)})


Coy has type of aaa
Joi has type of child
Leona has type of child
Toshiko has type of parent
Akira has type of parent


Ah-ha!

If you want to refine the sort order further, you can sort by type first and then by name, like this:

family = sorted(fam, key=lambda x: ("aaa" if x.type == "spouse" else x.type, x.name))
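
The "aaa" trick works because the remaining type names happen to sort alphabetically in the order I want. If that ever stops being true, one alternative (not what I used above) is an explicit rank table. A minimal sketch, where TYPE_RANK and family_sort_key are made-up names and a namedtuple stands in for the Family class:

```python
from collections import namedtuple

Family = namedtuple("Family", ["type", "name"])  # stand-in for the class above

# Hypothetical ranking: lower numbers sort first; unknown types go last.
TYPE_RANK = {"spouse": 0, "child": 1, "parent": 2}


def family_sort_key(member):
    """Sort by relationship rank first, then alphabetically by name."""
    return (TYPE_RANK.get(member.type, len(TYPE_RANK)), member.name)


fam = [
    Family("child", "Joi"),
    Family("child", "Leona"),
    Family("parent", "Toshiko"),
    Family("spouse", "Coy"),
    Family("parent", "Akira"),
]

for member in sorted(fam, key=family_sort_key):
    print(member.type, member.name)
# spouse Coy
# child Joi
# child Leona
# parent Akira
# parent Toshiko
```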


Hope this magic became clear to you as well.

Sunday, September 22, 2013

python mock: Difference between Mock and MagicMock

note: why use mock/fake/stub object for unit test:

Using mock objects allows developers to:
  1. write and unit-test functionality in one area without actually calling complex underlying or collaborating classes
  2. focus their tests on the behavior of the system under test (SUT) without worrying about its dependencies
When mock objects are replaced by real ones, the end-to-end functionality will need further testing. These will be integration tests rather than unit tests.

Mock and MagicMock objects create all attributes and methods as you access them and store details of how they have been used.

Mock allows you to assign functions (or other Mock instances) to magic methods and they will be called appropriately. The MagicMock class is just a Mock variant that has all of the magic methods pre-created for you (well, all the useful ones anyway) - (per http://www.voidspace.org.uk/python/mock/#quick-guide)

example:

>>> import mock
>>> mock1 = mock.Mock()
>>> mock2 = mock.MagicMock()

>>> mock2.__str__.return_value = 'foobarbaz'
>>> str(mock2)
'foobarbaz'
>>> mock2.__str__.assert_called_with()

>>> mock1.__str__.return_value = 'foobarbaz'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'method-wrapper' object has only read-only attributes (assign to .return_value)

# ah, there it is: on a plain Mock I have to configure the __str__ behavior myself

>>> mock1.__str__ = mock.Mock(return_value='wheeeeee')
>>> str(mock1)
'wheeeeee'
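
The same difference shows up with other magic methods, such as __len__. Here is a small sketch, assuming Python 3.3+, where the library ships as unittest.mock (the session above uses the standalone mock package, which behaves the same way for this case):

```python
from unittest import mock  # assumption: Python 3.3+ instead of "import mock"

plain = mock.Mock()
magic = mock.MagicMock()

# MagicMock pre-creates magic methods, so __len__ can be configured directly.
magic.__len__.return_value = 3
print(len(magic))  # 3

# A plain Mock does not support len() until a __len__ is assigned to it.
try:
    len(plain)
except TypeError as exc:
    print("plain Mock:", exc)

plain.__len__ = mock.Mock(return_value=5)
print(len(plain))  # 5
```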

SonarQube to analyze python code on mac

SonarQube is an open platform to manage code quality. As such, it covers the 7 axes of code quality:
- architecture & design
- duplications
- unit tests
- complexity
- potential bugs
- coding rules
- comments


This is the most minimal way to install SonarQube and start analyzing your Python code on your Mac (I run 10.8.5, but I think this should work on older OS X versions as well).
Estimated time to complete: 10 min


  1. Install SonarQube (server to host result)
    1. go to http://www.sonarqube.org/downloads/, click the "Download" link, and unarchive the file (I downloaded sonar-3.7)
    2. move the unarchived directory wherever you want it (I created a dir called sonar, so I now have sonar/sonar-3.7/, which I will refer to as SONARQUBE_HOME)
    3. cd SONARQUBE_HOME/bin/macosx-universal-64
    4. to start server: $ sh sonar.sh start
      1. once the server has started, you should be able to see the interface in your browser at http://localhost:9000/
  2. Install SonarQube Runner (client that analyzes your code and sends the result to the server)
    1. Download from here (at the time of writing 2.3 is the latest version) and unarchive it
    2. move the unarchived directory wherever you want it (I moved it under the sonar dir I created in step 1, so I now have sonar/sonar-runner-2.3/, which I will refer to as SONAR_RUNNER_HOME)
    3. Edit SONAR_RUNNER_HOME/conf/sonar-project.properties
      • sonar.projectKey=my:project
      • sonar.projectName=project
      • sonar.projectVersion=1.0
      • sonar.projectDescription=project description
      • sonar.sources=path-to-project-dir
      • sonar.language=py
  3. Install python plugin
    1. Download from here (at the time of writing 1.1 is the latest)
    2. copy downloaded jar file to SONARQUBE_HOME/extensions/plugins/
  4. Restart the SonarQube server
  5. Run SonarQube Runner to analyze
    1. note: I was getting a "Java heap space" error. In my case, the runner needed 269m. You can solve this by increasing the limit and running it as follows (the example sets it to 1g, but 512m etc. works as well): SONAR_RUNNER_OPTS=-Xmx1024m bin/sonar-runner
    2. now you can review your results at http://localhost:9000/
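
If you would rather kick off the analysis from a script, the heap setting from step 5 can be passed through the environment. This is only a sketch: runner_env and run_analysis are made-up helper names, and it assumes the runner is started from SONAR_RUNNER_HOME, the same as the bin/sonar-runner command above.

```python
import os
import subprocess


def runner_env(max_heap="1024m", base_env=None):
    """Return a copy of the environment with SONAR_RUNNER_OPTS set."""
    env = dict(os.environ if base_env is None else base_env)
    env["SONAR_RUNNER_OPTS"] = "-Xmx" + max_heap
    return env


def run_analysis(runner_home, max_heap="1024m"):
    """Run bin/sonar-runner from runner_home (assumption: same as step 5)."""
    runner = os.path.join(runner_home, "bin", "sonar-runner")
    return subprocess.call([runner], cwd=runner_home, env=runner_env(max_heap))
```

For example, run_analysis("/path/to/sonar/sonar-runner-2.3", "512m") would start the runner with a 512m heap limit; the path is a placeholder for your own SONAR_RUNNER_HOME.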