The Fedora 28 release has been announced. “The headline feature for Fedora 28 Server is the inclusion of the new Modular repository. This lets you select between different versions of software like NodeJS or Django, so you can choose the stack you need for your software.” Some users will also appreciate that proprietary blobs (such as the NVIDIA drivers) are now easier to obtain and install.
Hadoop User Experience (Hue) is an open-source, web-based, graphical user interface for use with Amazon EMR and Apache Hadoop. The Hue database stores things like users, groups, authorization permissions, Apache Hive queries, Apache Oozie workflows, and so on.
There might come a time when you want to migrate your Hue database to a new EMR cluster. For example, you might want to upgrade from an older version of the Amazon EMR AMI (Amazon Machine Image), but your Hue application and its database have had a lot of customization. You can avoid re-creating these user entities and retain query/workflow histories in Hue by migrating the existing Hue database, or remote database in Amazon RDS, to a new cluster.
By default, Hue user information and query histories are stored in a local MySQL database on the EMR cluster’s master node. However, you can create one or more Hue-enabled clusters using a configuration stored in Amazon S3 and a remote MySQL database in Amazon RDS. This allows you to preserve user information and query history that Hue creates without keeping your Amazon EMR cluster running.
This post describes the step-by-step process for migrating the Hue database from an existing EMR cluster.
Note: Amazon EMR supports different Hue versions across different AMI releases. Keep in mind the compatibility of Hue versions between the old and new clusters in this migration activity. Currently, Hue 3.x.x versions are not compatible with Hue 4.x.x versions, and therefore a migration between these two Hue versions might create issues. In addition, Hue 3.10.0 is not backward compatible with its previous 3.x.x versions.
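As a pre-flight check, the two compatibility notes above can be written down as a small Python helper. This is only a sketch under stated assumptions: the function names are hypothetical, versions are assumed to be dotted integers such as 3.10.0, and the second rule is read as "3.10.0 is a boundary within the 3.x.x line", which is an interpretation of the note rather than a documented guarantee. A result of None means only that neither note applies, not that the migration is known to be safe.

```python
def parse_version(text):
    """Turn a dotted version string such as '3.10.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def known_incompatibility(old_version, new_version):
    """Return a short reason if one of the two notes applies, else None.

    Hypothetical helper: it encodes only the two notes from the post.
    """
    old = parse_version(old_version)
    new = parse_version(new_version)

    # Note 1: Hue 3.x.x versions are not compatible with Hue 4.x.x versions.
    if {old[0], new[0]} == {3, 4}:
        return "Hue 3.x.x and Hue 4.x.x are not compatible"

    # Note 2 (assumed reading): releases before 3.10.0 and releases from
    # 3.10.0 onward sit on opposite sides of a compatibility boundary.
    boundary = (3, 10, 0)
    if old[0] == 3 and new[0] == 3 and (old < boundary) != (new < boundary):
        return "Hue 3.10.0 is not backward compatible with earlier 3.x.x"

    return None
```

For example, `known_incompatibility("3.7.1", "3.10.0")` returns the second reason, while two identical versions return None.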
Before you begin
First, let’s create a new testUser in Hue on an existing EMR cluster, as shown following:
You will use these credentials later to log in to Hue on the new EMR cluster and validate whether you have successfully migrated the Hue database.
Let’s get started!
Migration how-to
Follow these steps to migrate your database to a new EMR cluster and then validate the migration process.
1.) Make a backup of the existing Hue database.
Use SSH to connect to the master node of the old cluster, as shown following (if you are using Linux/Unix/macOS), and dump the Hue database to a JSON file.
Edit the hue-mysql.json output file by removing all JSON objects that have useradmin.userprofile in the model field, and save the file. For example, remove the objects as shown following:
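If the dump is large, hand-editing is error prone. The following Python sketch performs the same edit, assuming the dump is a single JSON array of objects that each carry a model field, as the step above describes. The function names and the output file name in the usage note are hypothetical.

```python
import json


def strip_user_profiles(records):
    """Return the dump records without the useradmin.userprofile objects."""
    return [r for r in records if r.get("model") != "useradmin.userprofile"]


def clean_dump(src_path, dst_path):
    """Filter the dump at src_path into dst_path; return how many objects were removed."""
    with open(src_path, encoding="utf-8") as src:
        records = json.load(src)  # assumed: one top-level JSON array
    kept = strip_user_profiles(records)
    with open(dst_path, "w", encoding="utf-8") as dst:
        json.dump(kept, dst, indent=2)
    return len(records) - len(kept)
```

Calling `clean_dump("hue-mysql.json", "hue-mysql.cleaned.json")` writes the filtered copy to a separate file, so the original dump stays available for comparison. Rename the cleaned file to hue-mysql.json before continuing with the remaining steps.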
2.) Store the hue-mysql.json file on persistent storage like Amazon S3.
You can copy the file from the old EMR cluster to Amazon S3 using the AWS CLI or Secure Copy (SCP) client. For example, the following uses the AWS CLI:
b.) Connect to the Hue database—either the local MySQL database or the remote database in Amazon RDS for your cluster as shown following, using the mysql client.
$ mysql -h HOST -u USER -pPASSWORD
For a local MySQL database, you can find the hostname, user name, and password for connecting to the database in the /etc/hue/conf/hue.ini file on the master node.
[[database]]
engine = mysql
name = huedb
case_insensitive_collation = utf8_unicode_ci
test_charset = utf8
test_collation = utf8_bin
host = ip-172-31-37-133.us-west-2.compute.internal
user = hue
test_name = test_huedb
password = QdWbL3Ai6GcBqk26
port = 3306
Based on the preceding example configuration, the sample command is as follows. (Replace the host, user, and password details based on your EMR cluster settings.)
$ mysql -h ip-172-31-37-133.us-west-2.compute.internal -u hue -pQdWbL3Ai6GcBqk26
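To avoid copying these values by hand, the lookup can be scripted. The Python sketch below reads the key = value pairs under the [[database]] header and builds the argument list for the mysql client. It is a minimal illustration with hypothetical function names, not a full hue.ini parser: it assumes one setting per line, as in the example above, and ignores commented-out lines.

```python
def read_database_settings(ini_text):
    """Collect key = value pairs from the [[database]] block of hue.ini text."""
    settings = {}
    in_block = False
    for raw in ini_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # Any other section header ends the [[database]] block.
            in_block = line == "[[database]]"
            continue
        if in_block and "=" in line and not line.startswith(("#", ";")):
            key, _, value = line.partition("=")
            settings[key.strip()] = value.strip()
    return settings


def mysql_argv(settings):
    """Build the mysql client arguments from the host, user, and password keys."""
    return [
        "mysql",
        "-h", settings["host"],
        "-u", settings["user"],
        "-p" + settings["password"],  # no space after -p, matching the command above
    ]
```

On the master node, passing the contents of /etc/hue/conf/hue.ini to `read_database_settings` and the result to `mysql_argv` yields a list that can be handed to `subprocess.run`. Keep in mind that a password on the command line is visible to other users in the process list.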
c.) Drop the existing Hue database with the name huedb from the MySQL server.
mysql> DROP DATABASE IF EXISTS huedb;
d.) Create a new empty database with the same name huedb.
mysql> CREATE DATABASE huedb DEFAULT CHARACTER SET utf8 DEFAULT COLLATE=utf8_bin;
i.) In MySQL, add the foreign key content_type_id back to the auth_permission table.
mysql> use huedb;
mysql> ALTER TABLE huedb.auth_permission ADD FOREIGN KEY (`content_type_id`) REFERENCES `django_content_type` (`id`);
j.) Start the Hue service again.
$ sudo start hue
hue start/running, process XXXX
That’s it! Now, verify whether you can successfully access the Hue UI, and sign in using your existing testUser credentials.
After you sign in successfully to Hue on the new EMR cluster, you should see a Hue homepage similar to the following, with testUser as the signed-in user:
Conclusion
You have now learned how to migrate an existing Hue database to a new Amazon EMR cluster and validate the migration process. If you have any similar Amazon EMR administration topics that you want to see covered in a future post, please let us know in the comments below.
Anvesh Ragi is a Big Data Support Engineer with Amazon Web Services. He works closely with AWS customers to provide them architectural and engineering assistance for their data processing workflows. In his free time, he enjoys traveling and going for hikes.
Security updates have been issued by Debian (firefox-esr, irssi, and librelp), Gentoo (busybox and plib), Mageia (exempi and jupyter-notebook), openSUSE (clamav, dhcp, nginx, python-Django, python3-Django, and thunderbird), Oracle (slf4j), Red Hat (slf4j), Scientific Linux (slf4j), Slackware (firefox), SUSE (librelp), and Ubuntu (screen-resolution-extra).
Security updates have been issued by Arch Linux (samba), CentOS (389-ds-base, kernel, libreoffice, mailman, and qemu-kvm), Debian (curl, libvirt, and mbedtls), Fedora (advancecomp, ceph, firefox, libldb, postgresql, python-django, and samba), Mageia (clamav, memcached, php, python-django, and zsh), openSUSE (adminer, firefox, java-1_7_0-openjdk, java-1_8_0-openjdk, and postgresql94), Oracle (kernel and libreoffice), Red Hat (erlang, firefox, flash-plugin, and java-1.7.1-ibm), Scientific Linux (389-ds-base, kernel, libreoffice, and qemu-kvm), SUSE (xen), and Ubuntu (curl, firefox, linux, linux-raspi2, and linux-hwe).
Security updates have been issued by CentOS (389-ds-base, dhcp, kernel, libreoffice, php, quagga, and ruby), Debian (ming, util-linux, vips, and zsh), Fedora (community-mysql, php, ruby, and transmission), Gentoo (newsbeuter), Mageia (libraw and mbedtls), openSUSE (php7 and python-Django), Red Hat (MRG Realtime 2.5), and SUSE (kernel).
Security updates have been issued by Debian (isc-dhcp and python-django), Gentoo (go and util-linux), Mageia (389-ds-base, dovecot, and tor), openSUSE (python-Django), Oracle (389-ds-base, kernel, libreoffice, and php), Scientific Linux (389-ds-base, kernel, libreoffice, and php), and Ubuntu (clamav and libreoffice).
Security updates have been issued by Debian (jackson-databind, leptonlib, libvorbis, python-crypto, and xen), Fedora (apache-commons-email, ca-certificates, libreoffice, libxml2, mujs, p7zip, python-django, sox, and torbrowser-launcher), openSUSE (libreoffice), SUSE (libreoffice), and Ubuntu (advancecomp, erlang, and freetype).
Security updates have been issued by Debian (django-anymail, libtasn1-6, and postgresql-9.1), Fedora (w3m), Mageia (389-ds-base, gcc, libtasn1, and p7zip), openSUSE (flatpak, ImageMagick, libjpeg-turbo, libsndfile, mariadb, plasma5-workspace, pound, and spice-vdagent), Oracle (kernel), Red Hat (flash-plugin), SUSE (docker, docker-runc, containerd, golang-github-docker-libnetwork and kernel), and Ubuntu (libvirt, miniupnpc, and QEMU).
Security updates have been issued by Debian (mpv), Fedora (jackson-databind), Mageia (flash-player-plugin), Slackware (kernel), and Ubuntu (python-django).
Version 2.0 of the Django web framework has been released. This version drops support for Python 2.x, and adds a long list of new features; see the announcement for details.
Security updates have been issued by Debian (graphicsmagick, libdatetime-timezone-perl, openjpeg2, thunderbird, and tzdata), Fedora (curl, glusterfs, java-1.8.0-openjdk, lame, lucene, SDL2, systemd, and xen), Red Hat (python-django), and Ubuntu (linux-lts-trusty and quagga).
Security updates have been issued by Arch Linux (flashplugin, kernel, lib32-flashplugin, and linux-lts), CentOS (postgresql), Debian (tcpdump and wordpress-shibboleth), Fedora (lightdm, python-django, and tomcat), Mageia (flash-player-plugin and libsndfile), openSUSE (chromium, cvs, kernel, and libreoffice), Oracle (postgresql), and Ubuntu (libgcrypt20 and thunderbird).
Security updates have been issued by Arch Linux (irssi), CentOS (httpd and kernel), Debian (nginx), Fedora (perl-DBD-MySQL and qt5-qtwebengine), Mageia (apache-mod_fcgid, cairo, jbig2dec, nodejs, and sudo), openSUSE (libreoffice, spice, and systemd), Red Hat (python-django-horizon), and SUSE (kernel and xorg-x11-server).
Security updates have been issued by Debian (freetype, jasper, python-django, slurm-llnl, and weechat), Fedora (dovecot and pcre2), Gentoo (adobe-flash), openSUSE (curl, gstreamer-plugins-base, libsndfile, and tiff), and Ubuntu (mysql-5.5 and mysql-5.7).
Security updates have been issued by Arch Linux (firefox and weechat), Debian (chicken, firefox-esr, libcroco, libreoffice, and tiff), Fedora (backintime, bind, firefox, libarchive, libnl3, pcre2, php-pear-CAS, and python-django), Mageia (icu and proftpd), openSUSE (mozilla-nss and wireshark), Red Hat (java-1.6.0-sun, java-1.7.0-oracle, and java-1.8.0-oracle), Scientific Linux (firefox and java-1.8.0-openjdk), Slackware (mozilla, ntp, and proftpd), and Ubuntu (firefox).
Security updates have been issued by Debian (libosip2, openoffice.org-dictionaries, and qbittorrent), Fedora (kernel, libpng12, libsndfile, libtiff, mediawiki, mupdf, qt5-qtwebengine, samba, xen, xorgxrdp, and xrdp), Mageia (mediawiki, ming, python-django, unshield, and webkit2), and openSUSE (postgresql93).
Security updates have been issued by Arch Linux (mediawiki, python-django, and python2-django), Debian (jasper, libdatetime-timezone-perl, logback, ming, potrace, and tzdata), Fedora (curl, ghostscript, icecat, and xen), openSUSE (apparmor), and Slackware (libtiff).
Security updates have been issued by Debian (python-django), Fedora (firebird), openSUSE (pidgin, ruby2.2, and ruby2.3), Red Hat (v8), Scientific Linux (bash, coreutils, curl, glibc, gnutls, kernel, libguestfs, ocaml, openssh, qemu-kvm, quagga, samba, samba4, subscription-manager, and wireshark), and Ubuntu (lightdm, linux-hwe, linux-lts-trusty, linux-lts-xenial, linux-ti-omap4, and python-django).