Migration to a new Mac: macOS needs to repair your Library to run applications

Migrating a Mac may not be your weapon of choice. I’d rather copy the files I’m interested in manually, or put both old and new computers on the same network and rsync whatever I need.

But life is not fair. We can’t always do whatever we want, at least not in this reality. So I was asked for help migrating an “old” MacBook (~2015) to a new one (~2018) with Touch Bar. How did I do that? First I got the very rare USB-C to Thunderbolt adapter, then I connected both Macs and opened Migration Assistant on both. The whole procedure is described in this Apple support post, so I’m not going to write it again. I’m only going to make some comments.

I migrated only user accounts and applications. Migrating everything would give the new Mac the same name and credentials as the old one, and I don’t want that on my network. The migration proceeded without issues, but when I tried to log in with the usertwo account from the “old” Mac, I got the error in the title. To fix it, I logged in as an administrator of the new computer (an account that was not migrated) and did this:

cd /Users/usertwo
ls -alh 
sudo chown -R usertwo ./*

The solution is at the end of this MacRumors post. I’m so pissed off that the procedure didn’t even copy the user IDs… what’s so complicated about that? 😀
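Before running chown blindly, it may help to confirm that the owner really is the problem. This is only a sketch: it compares the numeric owner of a home folder with the UID of the account, and the function names owner_uid and check_home_owner are made up for this example.

```shell
# Numeric UID that owns a path (ls -n prints IDs instead of names).
owner_uid() {
    ls -ldn "$1" | awk '{ print $3 }'
}

# Compare the owner of a home folder with the UID of the account.
check_home_owner() {
    user=$1
    home=$2
    want=$(id -u "$user") || return 2
    have=$(owner_uid "$home")
    if [ "$want" = "$have" ]; then
        echo "ok: $home is owned by $user (uid $want)"
    else
        echo "mismatch: $home is owned by uid $have, but $user is uid $want"
        return 1
    fi
}

# Usage on the new Mac, from the administrator account:
#   check_home_owner usertwo /Users/usertwo
```

A mismatch here is the situation that the chown above repairs.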


Syncthing install on CentOS 7.5

I will start by quoting the product. What is Syncthing? “Syncthing is an application that lets you synchronize your files across multiple devices. This means the creation, modification or deletion of files on one machine will automatically be replicated to your other devices.” This says it all. Next question: what for? The reasons can vary; mine is that it’s multi-platform: there are apps, a web interface, a GUI, and so on, and all of it for free. Unfortunately, installing it on CentOS is not for newbies. Let’s start.

Step one: create a yum repository. There’s an entry about Syncthing on the CentOS forum. It says to create a dedicated repository for Syncthing. What I did was copy an already existing repository file, rename it, and edit it.

cd /etc/yum.repos.d/
cp epel.repo syncthing.repo
gedit syncthing.repo

Inside the edited repo file, we copy this. Then run yum clean all and yum update. Or even better, reboot if you can. Finally:

yum install syncthing
systemctl stop firewalld
/bin/syncthing

And if you have a browser open, the Syncthing web UI will open. Now what? We go to the Syncthing configuration and edit it so that it listens on my CentOS client IP instead of the default one. We may want to create a service for the process, but I’m not going to tell you how to do that.
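Stopping firewalld completely works, but it is heavy-handed. A gentler option is to open only the ports Syncthing needs. The sketch below assumes the default ports were not changed (8384/tcp for the web UI, 22000/tcp for the sync protocol, 21027/udp for local discovery); the function name is made up, and it only prints the commands so you can review them first.

```shell
# Print the firewall-cmd calls that open Syncthing's default ports.
# Assumption: default ports; adjust the list if you changed them.
syncthing_firewall_cmds() {
    for port in 8384/tcp 22000/tcp 21027/udp; do
        echo "firewall-cmd --permanent --add-port=$port"
    done
    echo "firewall-cmd --reload"
}

# Review the output first...
syncthing_firewall_cmds
# ...then apply it as root:
#   syncthing_firewall_cmds | sh
```

With these rules in place there should be no need to stop firewalld at all.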

I test that I can access the web UI from another computer, and I can. Then I install the Syncthing Android app (running on the same network as my Syncthing web server) and add the device on the web interface. It’s not very intuitive: to add the device you get a QR code or a very long string of letters and numbers. Anyway, once I add it, I see on the web UI that Syncthing wants to add one of the folders of my phone to the “Folders” section. I click “it’s OK” and the sync begins. Once you are done, you have the typical options: Pause, Rescan, Edit…

I must say the final impression is very good, so I approve of it. The problem will be, as usual, to propagate and promote its usage. We’ll see how it goes!

CryoSPARC not starting after update to v2.8 on CentOS 7.X : bad timing interval

As usual, click here if you want to know what cryoSPARC is. I have created a cryoSPARC master-client setup. In principle I updated from v2.5 to v2.8 successfully by running cryosparcm update in a shell. It’s the standard procedure. Everything got updated, master and clients. But after the update I rebooted everything, and after the reboot of the master node the problems started. This is the symptom:

cryosparcm start
Starting cryoSPARC System master process..
CryoSPARC is not already running.
database: started
command_core: started

And the startup hangs there. The message telling you where to go to access your server never appears. Of course I waited. The status looks like this:

cryosparcm status
--------------------------------------------------
CryoSPARC System master node installed at
/XXX/cryosparc2_master
Current cryoSPARC version: v2.8.0
----------------------------------------------
cryosparcm process status:
command_core                     STARTING 
command_proxy                    STOPPED   Not started
command_vis                      STOPPED   Not started
database                         RUNNING   pid 49777, uptime XX
watchdog_dev                     STOPPED   Not started
webapp                           STOPPED   Not started
webapp_dev                       STOPPED   Not started
------------------------------------------------
global config variables:
export CRYOSPARC_LICENSE_ID="XXX"
export CRYOSPARC_MASTER_HOSTNAME="master"
export CRYOSPARC_DB_PATH="/XXX/cryosparc_database"
export CRYOSPARC_BASE_PORT=39000
export CRYOSPARC_DEVELOP=false
export CRYOSPARC_INSECURE=false

It looks like the situation in this cryoSPARC forum post. Unfortunately no solution is given there. We can also check what the webapp log is telling us:

 cryosparcm log webapp
    at listenInCluster (net.js:1392:12)
    at doListen (net.js:1501:7)
    at _combinedTickCallback (XXX/next_tick.js:141:11)
    at process._tickDomainCallback (XXX/next_tick.js:218:9)
cryoSPARC v2
Ready to serve GridFS
events.js:183
      throw er; // Unhandled 'error' event
      ^
Error: listen EADDRINUSE 0.0.0.0:39000
    at Object._errnoException (util.js:1022:11)
    at _exceptionWithHostPort (util.js:1044:20)
    at Server.setupListenHandle [as _listen2] (net.js:1351:14)
    at listenInCluster (net.js:1392:12)
    at doListen (net.js:1501:7)
    at _combinedTickCallback (XXX/next_tick.js:141:11)
    at process._tickDomainCallback (XXX/next_tick.js:218:9)

It looks like a port conflict: EADDRINUSE stands for “address in use”, and the stack trace (net.js, events.js) comes from the Node.js web app rather than from Java. So which process is already listening on port 39000?
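One way to answer that question is to list the listening sockets with ss and keep only the port of interest. A minimal sketch: the helper name filter_port is made up, and it reads ss -ltnp style output on standard input, where the fourth column is the local address and port.

```shell
# Keep only the lines of `ss -ltnp` output whose local port matches.
# Column 4 is "address:port"; the port is whatever follows the last colon.
filter_port() {
    awk -v port="$1" '{ n = split($4, a, ":"); if (a[n] == port) print }'
}

# Usage on the master node (run as root to see the process names):
#   ss -ltnp | filter_port 39000
```

If a line shows up, its last column names the process and PID holding the port.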

I clean up as suggested in this cryoSPARC post, or in this one, cleaning /tmp/ and trying to find and kill any rogue supervisord process, which I don’t have. Next I reboot the master, but the problem persists. Messing with MongoDB does not help either. What now? The cryoSPARC update installed a new Python, so I decide to force the reinstall of the dependencies. It is done like this:

cryosparcm forcedeps
  Checking dependencies... 
  Forcing dependencies to be reinstalled...
  --------------------------------------------------
  Installing anaconda python...
  --------------------------------------------------
..bla bla bla...
 Forcing reinstall for dependency mongodb...
  --------------------------------------------------
  mongodb 3.4.10 installation successful.
  --------------------------------------------------
  Completed.
  Completed dependency check. 

If I believe what the software tells me, everything is fine. I reboot and run cryosparcm start, but my command_core still hangs on STARTING. After several hours of investigation, I decide on a drastic solution: install everything again. Then I find it.

 ./install.sh --license $LICENSE_ID \
--hostname sparc-master.org \
--dbpath /my-cs-database/cryosparc_database \
--port 39000
ping: bad timing interval
Error: Could not ping sparc-master.org

What is this bad timing interval? I access my servers via SSH + VPN, so it could be that the installer can’t handle the I/O of such a setup, or the time servers we use, or something. Or maybe the Java versions differ? In any case, I approach it another way. I need to be closer. How?

I open a virtual desktop there, and in it I call an Ubuntu shell where I run my installer. Et voilà! The bad timing is gone, and the install goes on without any further issues. Note that I do a new install using the previous database (--dbpath /my-cs-database/cryosparc_database) so that everything, even my users, is the same as before 🙂

Long story short: shells may look the same but behave differently. Be warned!
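A possible explanation, not confirmed in this case: ping reports a bad timing interval when it cannot parse the value given to -i, and a fractional interval such as 0.2 can fail to parse in a locale that uses a comma as the decimal separator. That would fit two shells behaving differently. The sketch below only shows which separator the current shell uses; the function name is made up.

```shell
# Show the decimal separator of the current shell's locale.
# printf formats floats according to LC_NUMERIC, so "0.0" becomes "0,0"
# in locales that use a comma.
decimal_separator() {
    s=$(printf '%.1f' 0)
    echo "${s#0}" | cut -c1
}

decimal_separator
# If this prints "," one thing to try is a neutral locale for the installer:
#   LC_ALL=C ./install.sh ...
```

Comparing the output of this function (and of the locale command) in the SSH shell and in the virtual desktop shell would show whether the two environments really differ.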

Perl to Python, shell to perl, python to C : about code converters

First you need to have a reason to convert the code. Why convert a piece of code from one language to another? I’m going to name a few reasons:

  • Familiarity. Let’s say you are just a lamer, and you know by heart only Python, C, or FORTRAN, and you get your code in another language you are not fully fluent in. You can run a converter, then check the output in the language you control.
  • Integrability. The algorithm, the function, or whatever it is, needs to come together with other pieces written in that “other” language. Although of course it is possible to have some kind of suite written in several languages, everything is more readable and beautiful if it’s under a common grammar.
  • Portability. A lot of operating systems have shells, or something very similar or compatible. We can’t say the same of Python and Perl, although if you are a good programmer you could install the interpreter you need beforehand, for example if you need a specific Python to run your script.
  • Speed. Speed? Yes, speed. The same simulation code can differ by a factor of 10 in run time depending on whether it is compiled from C++ or from FORTRAN. I don’t have the numbers for Python versus R, but definitely, some solutions are better than others.

I say convert, not translate, since what I want is the functionality. I got a piece of Perl code of unknown value that I plan to use from a bash shell. As a first step, I want to convert it, so I google it. I found this sh2p code. It does the opposite of what I want (shell to Perl), but let’s install it. To do so:

# > perl Makefile.PL 
Checking if your kit is complete...
Looks good
Writing Makefile for App::sh2p
Writing MYMETA.yml and MYMETA.json

Now we make it

# > make
cp lib/App/sh2p/Builtins.pm blib/lib/App/sh2p/Builtins.pm
...some more here
cp bin/sh2p.pl blib/script/sh2p.pl
/usr/bin/perl -MExtUtils::MY -e 
'MY->fixin(shift)' -- blib/script/sh2p.pl
Manifying blib/man3/App::sh2p::Builtins.3pm
Manifying blib/man3/App::sh2p::Handlers.3pm
Manifying blib/man3/App::sh2p.3pm
Manifying blib/man3/App::sh2p::Trap.3pm

And they ask us to run a test as well, like this:

# > make test

PERL_DL_NONLAZY=1 /usr/bin/perl 
"-MExtUtils::Command::MM" "-e" 
"test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
t/App-sh2p.t .. ok 
All tests successful.
Files=1, Tests=10, 0 wallclock secs 
( 0.03 usr 0.01 sys + 0.06 cusr 0.01 csys = 0.11 CPU)
Result: PASS

Finally we install it:

make install
Installing /usr/local/share/perl5/App/sh2p.pod
Installing /usr/local/share/perl5/App/sh2p/Operators.pm
Installing /usr/local/share/perl5/App/sh2p/Here.pm
Installing /usr/local/share/perl5/App/sh2p/Statement.pm
Installing /usr/local/share/perl5/App/sh2p/Trap.pm
Installing /usr/local/share/perl5/App/sh2p/Builtins.pm
Installing /usr/local/share/perl5/App/sh2p/Handlers.pm
Installing /usr/local/share/perl5/App/sh2p/Parser.pm
Installing /usr/local/share/perl5/App/sh2p/Utils.pm
Installing /usr/local/share/perl5/App/sh2p/Compound.pm
Installing /usr/local/share/man/man3/App::sh2p::Handlers.3pm
Installing /usr/local/share/man/man3/App::sh2p::Trap.3pm
Installing /usr/local/share/man/man3/App::sh2p::Builtins.3pm
Installing /usr/local/share/man/man3/App::sh2p.3pm
Installing /usr/local/bin/sh2p.pl
Appending installation info to /usr/lib64/perl5/perllocal.pod

My test run (on a CentOS 7 client):

/usr/local/bin/sh2p.pl bind.sh bind.pl
Processing bind.pl:
# **** INSPECT: sleep replaced by Perl built-in sleep
# Check arguments and return value

And everything seems to be correct. Nice! We have a working shell-to-Perl translator. How about the other way around? I didn’t find anything, but there is a Perl-to-Python translator in this GitHub repo. I clone it and run it over the Perl script I just created (bind.pl), but the results are meaningless.

Let’s check more translators. How about making an executable with pp? No, it doesn’t seem to work. But this website seems to do the trick, even to C and C++, and even with incomplete code. I can now cut and paste what I want into my new project! And… that’s it for today, have a nice weekend!

An ASCII version of Star Wars Episode IV


This you need to check out! Open a command prompt, and type:

 telnet towel.blinkenlights.nl

Note that you need, of course, to have telnet. Found here while looking for Windows command prompt shortcuts. The full article is very interesting, so don’t go directly to the end of it. Note 2: it also works on Linux, if you have telnet installed. Have fun! 🙂

Slurm 18.08 with QOS mariadb problems on CentOS 7

I already told you how to install Slurm on CentOS 7 so I’m not going to repeat it for a  modern slurm package. I’m going to comment on the new issues I had using the procedure. Problem one: making rpms.

rpmbuild -ta slurm-15.08.9.tar.bz2

This I solved by using a variation of this solution. I just did it as root.

yum install 'perl(ExtUtils::Embed)' 'rubygem(minitest)'

You could also configure, make and make install the source code. Once done, I run a script to copy my slurm rpms or my slurm source code to the local machine, clean up the previous installation (deleting the packages and the munge and slurm users and folders) and install everything (munge + slurm).

Problem two: the Slurm database configuration. I’m going to start from a working installation of 18.08. That means you can submit jobs, they run, and so on. The first time I modified it I screwed up the queuing system: all jobs got stuck in state CG (completing). The solution to stuck CG jobs is scancel followed by:

scontrol update NodeName=$node State=down Reason=hung_proc
scontrol update NodeName=$node State=resume

Of course it is normal to make mistakes if you play around. On Sloppy Linux Notes they have a very short guide on how to install MariaDB with Slurm. Please try the above method before you go on reading; this one is a sad story 😦

So I had it already installed on my database client, but I was not using it. Instead of removing all the little bits and pieces, I tried to reset the mariadb root password. Note that you may want to recover the mysqld password instead. In any case, this is the error:

root@node ~ ## > mysql -u root -p
Enter password: 
ERROR 1045 (28000): Access denied for user 'root'@'localhost' 
(using password: YES)

Even with the right password. Depending on your install, skip-grant-tables may work; in my case, I get this:

MariaDB [(none)]> ALTER USER 'root'@'localhost' 
IDENTIFIED BY 'NewPass';
ERROR 1064 (42000): You have an error in your SQL syntax; 
check the manual that corresponds to your MariaDB server version 
for the right syntax to use near 'USER 'root'@'localhost' 
IDENTIFIED BY 'NewPass'

I check the documentation as suggested, but I still can’t get it to work. I even read some posts about the same problem on a Mac. I tried generating a password hash… but without luck. This works:

MariaDB [(none)]> SET PASSWORD FOR 'root'@'localhost' 
= PASSWORD('NewPass'); 
Query OK, 0 rows affected (0.00 sec)

But I can’t log in as root after flushing the privileges and removing skip-grant-tables from my.cnf. On DigitalOcean they also advise using ALTER USER, but instead of modifying my.cnf, they suggest starting the database with the grant tables skipped:

mysqld_safe --skip-grant-tables --skip-networking &

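My MariaDB version is 5.5, which predates ALTER USER (it was only added in MariaDB 10.2), and that explains the syntax error: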

root@node > rpm -qa | grep mariadb 
mariadb-server-5.5.60-1.el7_5.x86_64
mariadb-devel-5.5.60-1.el7_5.x86_64
mariadb-libs-5.5.60-1.el7_5.x86_64
mariadb-5.5.60-1.el7_5.x86_64

So:

MariaDB[(none)]> SET PASSWORD FOR 'root'@'localhost' 
= PASSWORD('NewPass');
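Put together, the reset that worked here can be sketched as follows. The helper name reset_password_sql is made up, and the FLUSH PRIVILEGES in front is my assumption: with the grant tables skipped, the server normally refuses account statements until they are reloaded. The helper only prints the SQL; the surrounding commands are in the comments.

```shell
# Print the SQL that resets a password on MariaDB 5.5.
# FLUSH PRIVILEGES reloads the grant tables so SET PASSWORD is accepted
# even though the server was started with --skip-grant-tables.
reset_password_sql() {
    user=$1
    host=$2
    pass=$3
    cat <<SQL
FLUSH PRIVILEGES;
SET PASSWORD FOR '$user'@'$host' = PASSWORD('$pass');
SQL
}

# Sequence, run as root on the database host:
#   systemctl stop mariadb
#   mysqld_safe --skip-grant-tables --skip-networking &
#   reset_password_sql root localhost 'NewPass' | mysql -u root
#   mysqladmin -u root -p shutdown
#   systemctl start mariadb
reset_password_sql root localhost 'NewPass'
```

The password is passed in plain text on the command line, so treat this as a one-off recovery procedure, not something to keep in a script.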

Now I can log in as root with my new password. What’s next? Yes, we need to set up the MariaDB slurm user and the Slurm database.

MariaDB [(none)]> CREATE USER 'slurm'@'node' 
IDENTIFIED BY 'SLURMPW';
Query OK, 0 rows affected (0.00 sec)
MariaDB [(none)]> create database slurm_acct_db;
Query OK, 1 row affected (0.00 sec)
MariaDB [(none)]> GRANT ALL PRIVILEGES ON 
`slurm_acct_db`.* TO 'slurm'@'node' with grant option;
Query OK, 0 rows affected (0.00 sec)
MariaDB [(none)]> grant all on slurm_acct_db.* TO 'slurm'@'node'
-> identified by 'SLURMPW' with grant option;
Query OK, 0 rows affected (0.00 sec)
MariaDB [(none)]> flush privileges;
Query OK, 0 rows affected (0.00 sec)
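The interactive session above can be condensed into a small helper that prints the statements for any user, host, password, and database. This is only a sketch: the function name is made up, it relies on GRANT ... IDENTIFIED BY creating the account when it does not exist yet (the default behaviour on MariaDB 5.5), and it leaves out the grant option, which slurmdbd should not need.

```shell
# Print the SQL that creates the accounting database and its user.
slurm_db_sql() {
    user=$1
    host=$2
    pass=$3
    db=$4
    cat <<SQL
CREATE DATABASE IF NOT EXISTS ${db};
GRANT ALL ON ${db}.* TO '${user}'@'${host}' IDENTIFIED BY '${pass}';
FLUSH PRIVILEGES;
SQL
}

# Review the statements...
slurm_db_sql slurm node SLURMPW slurm_acct_db
# ...then feed them to the server:
#   slurm_db_sql slurm node SLURMPW slurm_acct_db | mysql -u root -p
```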

Here you have how to add a user to mariadb with all privileges in case you need more info. And the documentation on GRANT. And all in a nutshell with a script. If you have problems with the database (for example it is corrupted)

root@node> more /var/log/mariadb/mariadb.log
XXX [ERROR] Native table 'performance_schema'.'rwlock_instances' 
has the wrong structure

you may want to DROP it or rebuild all the databases.

root@node ~ ## > mysql_upgrade -uroot -p --force
Enter password: 
MySQL upgrade detected
Phase 1/4: Fixing views from mysql
Phase 2/4: Fixing table and database names
Phase 3/4: Checking and upgrading tables
Processing databases

After such an action, it may be interesting to get a list of mariadb users and rights.  Or show your grants:

MariaDB [(none)]> show grants

But let’s not look back; let’s go ahead. If after all these troubles you didn’t give up and you have MariaDB running, it’s time to configure the slurmdbd daemon. Our slurmdbd.conf should look like this:

# /etc/slurm/slurmdbd.conf
AuthType=auth/munge
DbdAddr=localhost
DbdHost=localhost
SlurmUser=slurm
DebugLevel=4
LogFile=/var/log/slurm/slurmdbd.log
PidFile=/var/run/slurmdbd.pid
StorageType=accounting_storage/mysql
StorageHost=node
StoragePass=SLURMPW
StorageUser=slurm
StorageLoc=slurm_acct_db

We can start the daemon now…and here comes the section for slurmdbd errors.

Error: ConditionPathExists=/etc/slurm/slurmdbd.conf was not met
Solution: Check that the file exists, has that name, and is readable on ‘node’.

Error: This host (‘node’) is not a valid controller
Solution: Check your slurm.conf, where the controller is defined in ‘ControlMachine’.

Error: mysql_real_connect failed: 2003 Can’t connect to MySQL server on ‘node’
Solution: Check StorageHost=XXX in your slurmdbd.conf and AccountingStorageHost=XXX in slurm.conf. Use an IP address instead of a name.

Error: mysql_real_connect failed: 1045 Access denied for user ‘slurm’@’node’ (using password: YES)
Solution: Check that you can log in as ‘slurm’ with SLURMPW on mysql. If not, you need to create a user that is able to do that.

Error: Couldn’t load specified plugin name for accounting_storage/mysql: Plugin init() callback failed
Solution: Check that your MariaDB is up and running. Check that you have the accounting_storage.so. You may need to recompile everything…

Error: It looks like the storage has gone away trying to reconnect
Solution: Check that the cluster is seen by the accounting system. If not, you need to add it using an account manager command:

root@node ## > sacctmgr add cluster MYCLUSTER
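Several of the errors above come down to slurmdbd.conf being unreadable or pointing at the wrong host or user. A small pre-flight check can catch that before starting the daemon. This is a sketch with made-up function names; it only reads the KEY=value lines and prints the mysql command to try by hand.

```shell
# Value of KEY in a Slurm-style "KEY=value" file (first match wins).
conf_value() {
    awk -v key="$2" 'index($0, key "=") == 1 { print substr($0, length(key) + 2); exit }' "$1"
}

# Check that the config is readable and print a manual login test.
slurmdbd_preflight() {
    conf=$1
    if [ ! -r "$conf" ]; then
        echo "cannot read $conf"
        return 1
    fi
    host=$(conf_value "$conf" StorageHost)
    user=$(conf_value "$conf" StorageUser)
    db=$(conf_value "$conf" StorageLoc)
    echo "test the login with: mysql -h $host -u $user -p $db"
}

# Usage:
#   slurmdbd_preflight /etc/slurm/slurmdbd.conf
```

If the printed mysql command fails with the password from StoragePass, slurmdbd will fail in the same way.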

We need to set up QOS as well. To do so, we may need the consumable resource allocation plugin select/cons_res, that is to say, tell Slurm to manage the CPUs, RAM, and GPUs. Add something like this to your slurm.conf:

SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory

There are a lot of examples in the Slurm documentation on cons_res. Be aware that there is a cons_res bug on hardened systems if you compile Slurm with hardening. Let’s define some QOS as in the documentation.

sacctmgr add qos zebra
sacctmgr add qos elephant

And see what they look like:

sacctmgr show qos format=name,priority
Name       Priority 
---------- ---------- 
normal     0 
zebra      0 
elephant   0
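All the QOS above have priority 0, so they do not differ yet. Here is a sketch of how to create several of them with different priorities in one go; the function name and the priority values 10 and 20 are invented for illustration, and -i tells sacctmgr to commit without asking for confirmation.

```shell
# Print the sacctmgr calls for a list of name:priority pairs.
qos_cmds() {
    for pair in "$@"; do
        name=${pair%%:*}
        prio=${pair##*:}
        echo "sacctmgr -i add qos $name"
        echo "sacctmgr -i modify qos $name set priority=$prio"
    done
}

# Review the commands...
qos_cmds zebra:10 elephant:20
# ...then run them as root:
#   qos_cmds zebra:10 elephant:20 | sh
```

Keep in mind that a QOS priority only influences scheduling when slurm.conf uses the multifactor priority plugin with a non-zero PriorityWeightQOS.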

Now everything should be fine. We check:

root@node ## > slurmctld -Dvvv

If you need it, here you have the QOS at the biocluster. And the official documentation on Slurm accounting. And I’m pretty tired of fixing things, distributing files, and looking at logs. I hope you didn’t need it at all. Finally, just to finish this collection of troubles, the Slurm problems page from SCSC. Happy slurming…

Giving root power to a CentOS 7 user

This is an old one. I was explicitly avoiding going down this hole, but the time has come. We need to run a script that will copy data owned by root from storage A to storage B. We don’t want to change the permissions or data ownership, nor do we want to run it from a crontab. Solution: allow the normal user to run the script as root. It is not so complicated if you know how to do it.

We have tested the script, and it runs fine as root. I will place the script in /home/admin/bin/myscript.sh, which is accessible only to root. What the script does is irrelevant for the post. It could be a simple copy or rsync. In my case, it checks that the folder is properly named, that the data is not currently being transferred, and that the data folder does not exist already. Once we are happy with the script, we simply type visudo as root on the computer of choice for the data transfer task. We will see a file filled with explanations, which physically lives in /etc/sudoers. IMPORTANT: edit it with visudo, which locks the file and checks the syntax before saving; a syntax error in a hand-edited sudoers file can leave sudo unusable!

Let’s say we want to let the users alpha and beta run myscript.sh. Both are AD users, by the way. However, we give them access only from one machine, where we open visudo. We can edit the file as we do with vi, pressing i (for insert) to edit and typing :wq to write and quit. At the end of the sudoers file we add

alpha ALL= NOPASSWD: /home/admin/bin/myscript.sh
beta ALL= NOPASSWD: /home/admin/bin/myscript.sh

We save the file and test that it works as it does as root. Obviously this is not the most effective way if we want a lot of people to run our script, but in principle, we don’t want a lot of people moving data around. Or do we?
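If the list of users grows, the rules can be generated instead of typed. A minimal sketch with a made-up helper name: it prints one rule per user in the same form as above, so the result can be checked with visudo before it goes anywhere near the real sudoers file.

```shell
# Print a sudoers rule letting USER run SCRIPT as root without a password.
sudoers_rule() {
    printf '%s ALL= NOPASSWD: %s\n' "$1" "$2"
}

for user in alpha beta; do
    sudoers_rule "$user" /home/admin/bin/myscript.sh
done

# Save the output to a scratch file and let visudo check the syntax:
#   visudo -cf /path/to/scratch-file
# The users then run the script with:
#   sudo /home/admin/bin/myscript.sh
```

Because the rule names the full path, the users must call the script with exactly that path, and the script itself must stay writable only by root.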