CryoSPARC not starting after update to v2.8 on CentOS 7.X: bad timing interval

As usual, click here if you want to know what cryoSPARC is. I have created a cryosparc master-client setup. In principle the update from v2.5 to v2.8 went fine after running cryosparc update on a shell. It’s the standard procedure, and both master and clients got updated. But after the update I rebooted everything, and after the reboot of the master node the problems started. This is the symptom:

cryosparcm start
Starting cryoSPARC System master process..
CryoSPARC is not already running.
database: started
command_core: started

And the starting hangs there. The message telling you where to go to access your server never appears. Of course I waited. The status looks like this:

cryosparcm status
--------------------------------------------------
CryoSPARC System master node installed at
/XXX/cryosparc2_master
Current cryoSPARC version: v2.8.0
----------------------------------------------
cryosparcm process status:
command_core                     STARTING 
command_proxy                    STOPPED   Not started
command_vis                      STOPPED   Not started
database                         RUNNING   pid 49777, uptime XX
watchdog_dev                     STOPPED   Not started
webapp                           STOPPED   Not started
webapp_dev                       STOPPED   Not started
------------------------------------------------
global config variables:
export CRYOSPARC_LICENSE_ID="XXX"
export CRYOSPARC_MASTER_HOSTNAME="master"
export CRYOSPARC_DB_PATH="/XXX/cryosparc_database"
export CRYOSPARC_BASE_PORT=39000
export CRYOSPARC_DEVELOP=false
export CRYOSPARC_INSECURE=false

It looks like this cryosparc forum post. Unfortunately no solution is given there. We can also check what the webapp log is telling us:

 cryosparcm log webapp
    at listenInCluster (net.js:1392:12)
    at doListen (net.js:1501:7)
    at _combinedTickCallback (XXX/next_tick.js:141:11)
    at process._tickDomainCallback (XXX/next_tick.js:218:9)
cryoSPARC v2
Ready to serve GridFS
events.js:183
      throw er; // Unhandled 'error' event
      ^
Error: listen EADDRINUSE 0.0.0.0:39000
    at Object._errnoException (util.js:1022:11)
    at _exceptionWithHostPort (util.js:1044:20)
    at Server.setupListenHandle [as _listen2] (net.js:1351:14)
    at listenInCluster (net.js:1392:12)
    at doListen (net.js:1501:7)
    at _combinedTickCallback (XXX/next_tick.js:141:11)
    at process._tickDomainCallback (XXX/next_tick.js:218:9)

It looks like a Node.js problem rather than anything cryoSPARC-specific: EADDRINUSE means the address is already in use, so something is already listening on port 39000. But which process is creating the listening error?
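Before deleting anything blindly, we can ask which process is actually holding the port. Here is a minimal sketch of my own with psutil (assuming the default base port 39000 from the config above; lsof, ss or netstat would tell you the same, and you may need root to see processes owned by other users):

import psutil

BASE_PORT = 39000  # the CRYOSPARC_BASE_PORT from the config above

# list every process that is LISTENing on the base port
for conn in psutil.net_connections(kind='tcp'):
    if conn.laddr and conn.laddr.port == BASE_PORT and conn.status == psutil.CONN_LISTEN:
        name = psutil.Process(conn.pid).name() if conn.pid else 'unknown'
        print("port %d is held by pid %s (%s)" % (BASE_PORT, conn.pid, name))

If something shows up here, killing it (or finding out why it survived the reboot) is the obvious next step.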

I clean up as suggested in this cryosparc post, or in this one, deleting the leftovers under /tmp/ and trying to find and kill any rogue supervisord process. I don’t have one. Next I reboot the master, but the problem persists. Messing with MongoDB does not help either. What now? The cryosparc update installed a new python, so I decide to force the reinstall of the dependencies. It is done like this:

cryosparcm forcedeps
  Checking dependencies... 
  Forcing dependencies to be reinstalled...
  --------------------------------------------------
  Installing anaconda python...
  --------------------------------------------------
..bla bla bla...
 Forcing reinstall for dependency mongodb...
  --------------------------------------------------
  mongodb 3.4.10 installation successful.
  --------------------------------------------------
  Completed.
  Completed dependency check. 

If I believe what the software tells me, everything is fine. I reboot and run cryosparcm start, but my “command_core” still hangs on STARTING. After several hours of investigation, I decide on a drastic solution: install everything again. And then I find it.

 ./install.sh --license $LICENSE_ID \
--hostname sparc-master.org \
--dbpath /my-cs-database/cryosparc_database \
--port 39000
ping: bad timing interval
Error: Could not ping sparc-master.org

What is this bad timing interval? The message comes from ping itself, so the installer’s connectivity check against the master hostname is failing. I access my servers via SSH + VPN, so it could be that the installer can’t handle the I/O of such a setup, or the time servers we use, or that some software versions differ, or something else entirely. In any case, I approach it another way: I need to be closer. But how?

I open a virtual desktop there and, in it, I call an ubuntu shell where I run my installer. Et voila! The bad timing is gone, and the install goes on without any further issues. Note that I do a new install using the previous database (--dbpath /my-cs-database/cryosparc_database) so that everything, even my users, is the same as before 🙂

Long story short: shells may look the same but behave differently. Be warned!


Perl to Python, shell to perl, python to C: about code converters

First you need to have the need to convert the code. Why convert a piece of code from one language to another? I’m going to name a few reasons:

  • Familiarity. Let’s say you are just a lamer, and you know by heart only python, C, or FORTRAN, and you get your code in another language you are not fully fluent in. You can run a converter, then check the output in the language you control.
  • Integrability. The algorithm, the function, or whatever it is, needs to come together with other pieces written in that “other” language. Although of course it is possible to have some kind of suite written in several languages, everything is more readable and beautiful if it’s under a common grammar.
  • Portability. A lot of operating systems have shells, or something very similar or compatible. We can’t say the same of python and perl, although if you are a good programmer you could install the interpreter you need beforehand, like when you need a specific python to run your script.
  • Speed. Speed? Yes, speed. The same simulation code compiled from C++ may run in a tenth of the time of a FORTRAN compilation. I don’t have the numbers for python versus R, but definitely, some solutions are better than others.

I say convert, not translate, since what I want is the functionality. I got a piece of perl code of unknown value that I plan to use from a bash shell. As a first step, I want to translate it, so I google about it and find this sh2p code. It does the opposite of what I want (shell to perl), but let’s install it. To do so,

# > perl Makefile.PL 
Checking if your kit is complete...
Looks good
Writing Makefile for App::sh2p
Writing MYMETA.yml and MYMETA.json

Now we make it:

# > make
cp lib/App/sh2p/Builtins.pm blib/lib/App/sh2p/Builtins.pm
...some more here
cp bin/sh2p.pl blib/script/sh2p.pl
/usr/bin/perl -MExtUtils::MY -e 
'MY->fixin(shift)' -- blib/script/sh2p.pl
Manifying blib/man3/App::sh2p::Builtins.3pm
Manifying blib/man3/App::sh2p::Handlers.3pm
Manifying blib/man3/App::sh2p.3pm
Manifying blib/man3/App::sh2p::Trap.3pm

And they ask us to also run a test, like this:

# > make test

PERL_DL_NONLAZY=1 /usr/bin/perl 
"-MExtUtils::Command::MM" "-e" 
"test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
t/App-sh2p.t .. ok 
All tests successful.
Files=1, Tests=10, 0 wallclock secs 
( 0.03 usr 0.01 sys + 0.06 cusr 0.01 csys = 0.11 CPU)
Result: PASS

Finally we install it:

make install
Installing /usr/local/share/perl5/App/sh2p.pod
Installing /usr/local/share/perl5/App/sh2p/Operators.pm
Installing /usr/local/share/perl5/App/sh2p/Here.pm
Installing /usr/local/share/perl5/App/sh2p/Statement.pm
Installing /usr/local/share/perl5/App/sh2p/Trap.pm
Installing /usr/local/share/perl5/App/sh2p/Builtins.pm
Installing /usr/local/share/perl5/App/sh2p/Handlers.pm
Installing /usr/local/share/perl5/App/sh2p/Parser.pm
Installing /usr/local/share/perl5/App/sh2p/Utils.pm
Installing /usr/local/share/perl5/App/sh2p/Compound.pm
Installing /usr/local/share/man/man3/App::sh2p::Handlers.3pm
Installing /usr/local/share/man/man3/App::sh2p::Trap.3pm
Installing /usr/local/share/man/man3/App::sh2p::Builtins.3pm
Installing /usr/local/share/man/man3/App::sh2p.3pm
Installing /usr/local/bin/sh2p.pl
Appending installation info to /usr/lib64/perl5/perllocal.pod

My own test run (on a CentOS 7 client):

/usr/local/bin/sh2p.pl bind.sh bind.pl
Processing bind.pl:
# **** INSPECT: sleep replaced by Perl built-in sleep
# Check arguments and return value

And everything seems to be correct. Nice! We have a working shell to perl translator. How about the other way around? I didn’t find anything equivalent, but there is one perl to python translator in this github repo. I clone it, download it, whatever, and I run it over the perl script I just created (bind.pl), but the results are meaningless.

Let’s check more translations. How about making an executable with pp? No, it doesn’t seem to work. But this web here seems to do the trick, even to C and C++, and even with incomplete parts. I can now cut and copy what I want into my new project! And… that’s it for today, have a nice weekend!

Monitoring network traffic on Windows

I’m now trying to get a global number for bandwidth usage. The reason is quite complex, and it’s out of the scope of this post, but basically, I want to be able to say we use 15 GB/s in total on average, or something like that. I also want to plot it, to see how the network demand is evolving. For that I need to record CSV values like date,value to a file. Let’s have a look at how to get that on Windows.

The natural network tool is netstat. Here you have a guide on netstat. We can use netstat to get a lot of interesting information, like who’s connected to the machine. We want to use it to get the interface statistics. Try it out:

netstat -e

The given numbers are not human-readable, at least not for me. We can wrap the command in a windows batch script and do some calculations with its output, so that we end up with the numbers I want. The solution given here seems to work, but only for a while: I left my modified version of it running, but after some time the stdout on my cmd shell was no longer refreshed. Since I don’t want to debug it and I don’t need to use batch, I move on and search for a python network bandwidth monitor. That solution kind of works. My modified version is here:

import time
import sys
from time import gmtime, strftime

import psutil

# send everything we print to a CSV file
sys.stdout = open('running.csv', 'w')

def main():
    old_value = 0
    while True:
        new_value = (psutil.net_io_counters().bytes_sent +
                     psutil.net_io_counters().bytes_recv)
        if old_value:
            # bytes moved since the last loop iteration
            send_stat(new_value - old_value)
        old_value = new_value
        time.sleep(10)
        sys.stdout.flush()

def convert_to_gbit(value):
    # bytes over the interval -> gigabits, rounded to 4 decimals
    return str(round(value/1024./1024./1024.*8, 4))

def send_stat(value):
    # one CSV line: date,value
    print strftime("%Y-%m-%d-%H-%M,", gmtime()), convert_to_gbit(value)

main()

As usual with python, be careful with the indentation. Some comments about my modifications. Number one: I round the value; you may need fewer decimals, or more. Number two: I print the current date in the format I like. Number three: I write to a file, which I flush on each while loop. And if you have problems with psutil, you can learn how to install psutil here. Basically,

C:\>Python27\python.exe -m pip install psutil

Don’t forget that there are of course other professional solutions, if you just want to monitor one computer. I don’t want to monitor only one, and I don’t want to install “alien” software on my “delicate” windows servers. That’s why I go for this. Unfortunately, life is complicated, what can I say 😦

Web Chart and Graph tools: Zing and CanvasJS

[annual-sales-dashboard example chart]

Coding is scary. So the more code I can steal, the less scared I feel. Unfortunately we need web plots, and the more beautiful, the better. I have already written quite some posts about data management, so you know all of this is about handling data; sometimes from a database, sometimes from command-line results. If you have a database, I heard that Zing seems to be the weapon of choice. Here you have a working example of loading MySQL data to create charts. Unfortunately Zing is, as you will read later, not my choice.

Since we speak about MySQL, it may be interesting for debugging purposes to graph data from a mysql database in python. At the previous link, you have a step-by-step tutorial that covers everything from the initial health checks on mysql to embedding the python plotly result in a web page as an iframe. The results look very professional, but if you ask me, it’s overkill for my already structured data. What is structured data? For me, something like username:value, date:value or parameter:value, in a list, recorded or available as standard output. Something I can grep. It can be CSV, or it can be separated by a space.

For that kind of data, my choice is CanvasJS, from where I took the picture above. It uses javascript to plot my values, and it has a lot of easy (cut-and-paste) examples of angular and bar charts, react charts with zooming and similar, and very practical dynamic charts. Play with it, since it’s free. What I did with it: I created a chart from CSV with CanvasJS to test my data, then rendered multiple charts in a web page. That page can be html, or php code that reads your csv. The choice is yours 🙂
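By the way, if your data is already a plain date,value CSV (like the running.csv from the monitoring post above), a small helper of my own, not part of CanvasJS itself, can turn it into the dataPoints list their examples expect. A sketch:

import csv
import json

def csv_to_datapoints(path):
    # turn each "date,value" row into the {label, y} dict CanvasJS charts use
    points = []
    with open(path) as handle:
        for row in csv.reader(handle):
            if len(row) < 2:
                continue
            points.append({"label": row[0].strip(), "y": float(row[1])})
    return points

# paste the printed JSON into the dataPoints field of any CanvasJS example
print(json.dumps(csv_to_datapoints("running.csv"), indent=2))

The rest (the chart options, the html skeleton) you can cut and copy from their gallery.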

 

Install and use of R in CentOS 7.6

I want to create plots from my slurm cluster, and I’ve decided to do it in a modern way, with R. Let’s go through the install of R on CentOS 7 first, then the install of the R packages, and then let’s generate some plots.

 yum install R -y

After a lot of packages, I end up with my R prompt. It looks like this:

## > R

R version 3.5.3 (2019-03-11) -- "Great Truth"
Copyright (C) 2019 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

Now we need to install the data.table package. This is done by simply typing the command below at the R prompt. This is what happens:

> install.packages("data.table")
Installing package into ‘/usr/lib64/R/library’
(as ‘lib’ is unspecified)
--- Please select a CRAN mirror for use in this session ---

And I get a pop-up window with a list of mirror servers. Nice! After the selection, the package is downloaded, compiled, and installed. At the end, it looks like this:

* DONE (data.table)
Making 'packages.html' ... done

The downloaded source packages are in
‘/tmp/Rtmpl2VWpp/downloaded_packages’
Updating HTML index of packages in '.Library'
Making 'packages.html' ... done
>

I leave by typing q() and saving the workspace image. All of this I do, of course, on the login node where the slurm database daemon is running. Now I get the slurm-stats scripts.

 git clone https://github.com/CSCfi/slurm-stats.git

And I do some tests in the folder where the R scripts live. I generate a data file like this:

sacct --format User,Partition,Submit,Start > sisu

It is a variation of the minimal setup: I removed the parameters whose purpose I don’t know. I run the R script… and it fails miserably.

Error in eval(bysub, parent.frame(), parent.frame()) : 
object 'Partition' not found
Calls: [ -> [.data.table -> eval -> eval
Execution halted

Let’s try with a little bit more care, that is, asking for parsable output, all users, and a start date.

sacct --format User,Partition,Submit,Start -P -a -S 04/19 > sisu
R --no-save --args "sisu" < sacct_stats_queue_dist.R

The output tells me this at the end.

> 
> write.csv(out,paste(filename,"_out.csv",sep=""),
row.names=FALSE, na="")
>

And I have my CSV (Comma Separated Values) file generated from the sacct output. A CSV that I can produce via a script, in a relatively easy way. Now it’s time to tune this up. And plot it. And… gosh, it’s late, I’m tired, and I think I should leave it here and check it out next time. So see you around!

 

SEO and big data over (my) WP data

Stats, oh stats. Sometimes you love them, sometimes you hate them. WordPress (WP) already offers a lot of numbers that you may want to crunch in a different way. The truth may disappoint you, or it can be completely inconclusive, but the temptation exists.

And I took it. I’m somewhat happy with my stats at this stage. I don’t distribute my posts over social media (no twitter, no facebook), but I have readers worldwide, and a little here, a little there, it adds up to around 300 visits a day now (2019) versus 300 visits a month in 2017, a year after I started this blog. Anyway, let’s do this.

If you want to start your own analysis class, you should define it. At building time we could already suffer from P-hacking, since we know what we are looking for, but I will assume you don’t mind. As members of my data class I take what I consider relevant: the year, the month, the day of the week, the local time, the number of posts per week, the category, and finally the number of visitors. Of course I’m expecting a growing number of visitors, although this is not necessarily assured. In fact, that’s why I’m writing this: my number of visitors has stayed stagnant for quite some time already. Let’s start by downloading my stats. Got it? Now you have a CSV file that we can use to start filling up our class. CSV (“comma separated values”) data can be processed in any way you like, with ipython, excel, or C++. I chose python. Since I don’t want to teach you how to plot CSV data with python, and neither do I want to flood you with my plots, I will jump directly to my conclusions.
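Just to give an idea of the kind of processing I mean before jumping to the conclusions, here is a minimal sketch that fills one of the class members, visits per day of the week. The file name and the column names (date, views) are hypothetical; adapt them to whatever your WP export actually contains:

import csv
from collections import defaultdict
from datetime import datetime

visits_per_weekday = defaultdict(int)

# hypothetical file and column names -- check your own WP export
with open("wordpress_stats.csv") as handle:
    for row in csv.DictReader(handle):
        day = datetime.strptime(row["date"], "%Y-%m-%d")
        visits_per_weekday[day.strftime("%A")] += int(row["views"])

# dump the aggregated numbers as weekday,visits lines
for weekday, visits in sorted(visits_per_weekday.items()):
    print("%s,%s" % (weekday, visits))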

Most of my visits come from people searching for computing solutions. This means my dragons are not popular at all. Accumulated visits to some of my plotlines (like The Water Wedding) are not bad, but it is clear to me that 80% of the people come for my bits. I don’t know whether to be unhappy about it, but since both blog lines (bits & dragons) are original and have a purpose, I’m fine. And I will keep posting dragons 🙂

The next interesting finding is that there is no correlation between the number of posts per week and the number of visits. It means I can go on holidays, or post absolutely nothing for a week, and the number of visits will not go down. I was not able to disentangle (at least not yet, due to the scarcity of the data) whether the visitors during these postless weeks were coming for computing advice or for my dragons. Maybe I should spend more time working on my analysis routine.

I’m not so sure about this one, but it looks like posts published early (local time) are more popular than late ones. That is, if I post before lunch, the post gets more views than if I post in the evening. There’s also a clear correlation with the topic: before lunch I tend to post about computing. I don’t know if this is connected with the readers’ local time. The country is not one of my class parameters; I will for sure include it next time.

Last but not least, I found out that weekends are always bad in terms of the number of visitors. I do write on some weekends, but only my dragons, which, as I said before, seem not to be the most popular option amongst the random visitors, so it makes sense.

I didn’t care about the number of likes and comments. It’s not that I don’t care personally, but I do know you need some commitment as a reader to press like, or to write a comment, so these parameters are out of my conclusions. Also, I didn’t care about the number of categories or tags linked to each post, but I guess the bits always have more. Good tags are really the SEO part, but I don’t write a blog to live from it, so let’s say I just want to raise the concern about them.

Anyway, it was fun to have a look at all of it. So what do you think? Do you observe the same phenomena? Or is it different for you? Don’t worry about not answering, it’s more of a rhetorical question 😉

OSX ipython UnknownBackend %matplotlib unable to use

I’m continuing with my data science experiments. If you are also following some text instead of learning by use, you may have found that you are unable to use matplotlib as suggested in the text. The line

%matplotlib inline

produces a long dump that ends with

UnknownBackend: No event loop integration for 'inline'. 
Supported event loops are: qt, qt4, qt5, gtk, gtk2, gtk3, tk, 
wx, pyglet, glut, osx

You can always ignore the inline command and save your plot “plt” using savefig.

In [47]: plt.savefig('scatter.png')   

This will save your plot in the current folder where you run ipython, as a png named ‘scatter.png’. But we don’t want to be saving and checking at each step; we want to see the plot first. The solution, like all good solutions, is easy once you know it. Instead of:

In [41]: %matplotlib inline                       

You write:

In [42]: %matplotlib osx 
In [43]: import matplotlib.pyplot as plt                        
In [44]: import seaborn; seaborn.set()          
In [45]: plt.scatter(X[:, 0], X[:, 1]);  

Now your plots will display in a separate window. You’re welcome 🙂