CAT-SOOP is a flexible, programmable learning management system based on the Python programming language. https://catsoop.mit.edu
<python>
cs_content_header = cs_long_name
</python>

This page describes a fairly typical CAT-SOOP setup, and assumes that you have
already installed and configured CAT-SOOP as described on [this
page](CURRENT/..).

<tableofcontents/>
<section>Check Web Settings</section>

To start, double-check the following values in `config.py`:

* `cs_url_root`, which is the URL of the root of the CAT-SOOP installation.
* `cs_checker_websocket`, which tells CAT-SOOP where clients can make websocket connections to the checker.

For example:

```
cs_url_root = 'http://localhost:6010'
cs_checker_websocket = 'ws://localhost:6011'
```

Typically, on a public-facing server, `cs_url_root` will start with `https`,
and `cs_checker_websocket` will start with `wss`.
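
A quick way to catch a mismatch between the two settings is to compare their URL schemes, since browsers generally refuse an insecure `ws` connection from a page served over `https`. The following is a minimal sketch, not part of CAT-SOOP; the helper name `schemes_consistent` and the example URLs are assumptions made for illustration.

```python
from urllib.parse import urlparse

# Example values; substitute the ones from your own config.py.
cs_url_root = 'https://your.server.com/cat-soop'
cs_checker_websocket = 'wss://your.server.com/reporter'

# A secure page should use a secure websocket, and vice versa.
EXPECTED_WS_SCHEME = {'http': 'ws', 'https': 'wss'}


def schemes_consistent(url_root, websocket_url):
    """Return True if the websocket scheme matches the page scheme."""
    page_scheme = urlparse(url_root).scheme
    ws_scheme = urlparse(websocket_url).scheme
    return EXPECTED_WS_SCHEME.get(page_scheme) == ws_scheme


print(schemes_consistent(cs_url_root, cs_checker_websocket))
```

The check looks only at the schemes. It does not verify that either URL is reachable.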
<div class="callout callout-warning">
<h4>Double Check</h4>
<p>Make sure that the <code>cs_fs_root</code> directory can be read from and written to by
the web server's user.</p>
<p>Make sure that the <code>cs_data_root</code> directory is <b>not</b> web-accessible, and that
the web server's user has read/write access.</p>
</div>
By default, running `catsoop start` will start several processes. The most
important are the WSGI server (default port `6010`) and the checker's websocket
server (default port `6011`). You can change these ports by setting the
additional variables `cs_wsgi_server_port` and `cs_checker_server_port`,
respectively, in your `config.py`.
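
For instance, the relevant lines of `config.py` might look like the sketch below. The port numbers are hypothetical and chosen only for illustration; any free ports should work as long as the nginx configuration points at the same ones.

```python
# Hypothetical port numbers, chosen only for illustration.
cs_wsgi_server_port = 7010     # WSGI server (default: 6010)
cs_checker_server_port = 7011  # checker websocket server (default: 6011)
```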
<section>Configure nginx</section>

Next, we will configure nginx to redirect relevant traffic to the web server
and the websocket server.

Start by creating a new file in `/etc/nginx/sites-available` with the following
content, which will configure nginx to route certain requests to CAT-SOOP, and
to redirect all traffic to HTTPS.

You can, of course, customize the endpoints (`/cat-soop` and `/reporter` in the
example below) to change the base URL for both the WSGI server and the
websocket server.

<div class="callout callout-info">
<h4>Note</h4>
If you do not already have one and you are planning to make a public-facing
server, you should acquire an SSL certificate. If your server is running at
MIT, you can follow the instructions <a
href="http://kb.mit.edu/confluence/display/istcontrib/Obtaining+an+SSL+certificate+for+a+web+server"
target="_blank">on this page</a>. Otherwise, SSL/TLS Certificates are
available gratis from <a href="https://letsencrypt.org/" target="_blank">Let's
Encrypt</a>.
</div>
```
# redirect all HTTP traffic to HTTPS
server {
    listen 80;
    listen [::]:80;
    return 301 https://$host$request_uri;
}

server {
    # listen on port 443 (standard port for HTTPS traffic) and enable SSL
    listen 443 ssl;
    listen [::]:443 ssl;

    # the following should reference your SSL cert and key file
    ssl_certificate /path/to/certificate-chain.crt;
    ssl_certificate_key /path/to/keyfile.key;

    # by default, serve files from /var/www/html
    root /var/www/html;

    # set the server's name (change this to reflect your server's FQDN)
    server_name your.server.com;

    # try adding trailing slashes before 404'ing
    location / {
        try_files $uri $uri/ =404;
    }

    # ignore .ht* files
    location ~ /\.ht {
        deny all;
    }

    # the following will route requests to https://your.server.com/cat-soop
    # to the uWSGI server. change "cat-soop" in the following lines if you
    # want to use a different URL.
    location /cat-soop {
        rewrite /cat-soop/?(.*) /$1 break;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_cache_bypass $http_upgrade;
        proxy_pass http://localhost:6010/;
    }

    # the following will route websocket requests to
    # wss://your.server.com/reporter to CAT-SOOP's websocket server.
    location /reporter {
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_cache_bypass $http_upgrade;
        proxy_pass http://localhost:6011/;
    }
}
```
<div class="callout callout-warning">
<h4>Double Check</h4>
<p>Note that your <code>cs_url_root</code> and <code>cs_checker_websocket</code> should match the nginx configuration. In the example above, we should have <code>cs_url_root = 'https://your.server.com/cat-soop'</code> and <code>cs_checker_websocket = 'wss://your.server.com/reporter'</code> in the <code>config.py</code> file.</p>
<p>Also, make sure the ports (<code>6010</code> and <code>6011</code> in the example above) match the port numbers you set in <code>config.py</code>, if any.</p>
</div>

<div class="callout callout-info">
<h4>Note</h4>
If you want the root of the webserver to point to the CAT-SOOP instance, you
can remove the block labeled <code>location /</code> from the example above,
change <code>location /cat-soop</code> to be <code>location /</code>, and
comment out the <code>rewrite</code> line within that block.
</div>
<subsection>(Optional) Client Certificate Authentication</subsection>

If you would like to enable authentication based on client certificates (instead of username/password), add the following lines beneath the other `ssl_` configuration directives in the NGINX configuration file:

```
ssl_client_certificate /path/to/client_ca.pem;
ssl_verify_client on;
```

where `/path/to/client_ca.pem` is the location on disk of the CA certificate with which client certificates are signed.
<section>(Optional) Configure Workers</section>

<subsection>Web Server</subsection>

By default, CAT-SOOP uses [cheroot](https://github.com/cherrypy/cheroot) as its
WSGI server. However, for public-facing instances, we recommend using
[uWSGI](https://uwsgi-docs.readthedocs.io/en/latest/) instead of cheroot. To
do so, set `cs_wsgi_server = 'uwsgi'` in your `config.py` file. To make uWSGI
spawn multiple worker processes, set the `cs_wsgi_server_min_processes` and
`cs_wsgi_server_max_processes` variables. When using uWSGI, you do not need to
do any special NGINX configuration for load balancing.

Alternatively, if you prefer to use `cheroot`, you can cause CAT-SOOP to launch
more than one worker by setting `cs_wsgi_server_port` to a _list_ of integers
instead of a single integer. In this case, you will also likely want to
configure NGINX to balance the load between the different processes by
following the instructions [on this
page](http://nginx.org/en/docs/http/load_balancing.html) (and, importantly,
including the `ip_hash;` directive so users' sessions are not lost).
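
As a rough sketch, the two options might look like this in `config.py`. The variable names are the ones described above, but the process counts and port numbers are illustrative assumptions rather than recommended values.

```python
# Option 1: uWSGI managing its own pool of worker processes.
cs_wsgi_server = 'uwsgi'
cs_wsgi_server_min_processes = 2   # illustrative value
cs_wsgi_server_max_processes = 8   # illustrative value

# Option 2 (instead of the above): cheroot with one worker per port.
# 6011 is skipped because it is the checker's default port.
# cs_wsgi_server_port = [6010, 6012, 6013]   # illustrative ports
```

With the second option, each port in the list would also need to appear in the nginx load-balancing configuration.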
<subsection>Checker</subsection>

By default, CAT-SOOP's checker will run at most 1 check at a time. If you have
the resources available, you can configure the checker to run multiple checks
in parallel by setting `cs_checker_parallel_checks` to a larger (integer)
number in your `config.py`.
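
Because `config.py` is ordinary Python, the value can also be computed. The sketch below derives it from the number of CPU cores; the "leave one core free" heuristic is an assumption made here for illustration, not a CAT-SOOP recommendation.

```python
import os

# Leave one core free for the web server, but never drop below 1 check.
# os.cpu_count() can return None, hence the fallback to 1.
cs_checker_parallel_checks = max(1, (os.cpu_count() or 1) - 1)
```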
<section>Start CAT-SOOP</section>

From within the `scripts` directory of the CAT-SOOP source, run the following
command to start CAT-SOOP:

```
$ catsoop start
```

On a typical webserver, it is a good idea to run the command in a screen so
that the process does not die when you hang up. Alternatively, you can use
`nohup`. For example,

```
$ nohup catsoop start > /dev/null &
```

<section>Test Configuration</section>

Direct your web browser to your `cs_url_root` and you should now see the
CAT-SOOP default page!
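
If the page does not load, it can help to check whether the two servers are accepting connections at all before looking at nginx. The following is a minimal sketch using only the standard library; `port_is_open` is a hypothetical helper, and the ports are the defaults mentioned on this page.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Default ports from this page; adjust if you changed them in config.py.
for name, port in [('WSGI server', 6010), ('checker websocket', 6011)]:
    status = 'listening' if port_is_open('localhost', port) else 'NOT reachable'
    print(f'{name} (port {port}): {status}')
```

A port that is listening here but unreachable through `cs_url_root` suggests the problem is in the nginx configuration rather than in CAT-SOOP itself.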
<section>(Optional) Configure Backups</section>

All of CAT-SOOP's data are stored in files on disk in a directory called
`__LOGS__` in the `cs_data_root` location specified above (by default,
`~/.local/share/catsoop`). CAT-SOOP itself will not back these files up, but
there are many strategies for backups using common utilities.

I have used many approaches in the past, but my usual approach involves setting
the `__LOGS__` directory up as a Mercurial or Git repository, and then setting
up a cron job to commit all files in that repository and push to several
locations. This approach has several advantages over simply using `rsync` or
`scp` to copy the folder to a remote machine. In particular, it allows you to
roll back to any past backup while keeping size down by only storing diffs
(instead of storing a complete copy of each file for each backup).

Here, we'll set up a backup using Mercurial (which tends to be better than Git
at efficiently storing the binary logs without manual intervention). To set
this up, first move yourself to the `__LOGS__` directory and run `hg init`,
followed by `hg add .`. This will set your `__LOGS__` directory up as a
Mercurial repository. You can then set up a cron job to commit all changes and
push these changes to an arbitrary number of backup locations (local or
remote).

The following example script (`/home/catsoop/do_backup.sh`) was used by several
classes in fall 2018. It commits local changes to a Mercurial repository, and
it then pushes those changes to one local location (on a separate disk) and to
one remote location.
```bash
#!/bin/bash
# move to the logs directory; bail out if it is missing
cd /home/catsoop/.local/share/catsoop/_logs || exit 1
# record new, changed, and deleted files, then commit with a timestamp
hg addremove .
hg commit -m "$(date +'%Y-%m-%d:%H:%M')"
# push to a local backup (on a separate disk) and to a remote backup
hg push /storage2/backup
hg push ssh://catsoop@catsoop.org/backups/py
```

It can then be configured to run, for example, twice an hour (at xx:05 and
xx:35) with the following crontab entry:

```nohighlight
5,35 * * * * /usr/bin/flock -n /tmp/backup.lockfile /home/catsoop/do_backup.sh 2>&1 >/dev/null
```
<section>(Optional) Set Up Local Python Sandbox</section>

By default, Python code that needs to be sandboxed (for example, student code
from the `pythonic` or `pythoncode` question types) will be sent to
`catsoop.org` to be run.

It is fine to leave things this way if you'd like. I will keep that service up
as long as is feasible, and the sandbox doesn't log anything about the code it
runs. That said, you may also wish to set things up so that the code runs on
your machine. The main benefit of this approach is that you don't have to rely
on an external service (network issues or our server's downtime won't affect
you, and you have a sandbox to yourself instead of having to share with
others).

Our recommended sandboxing approach involves creating a Python virtual
environment to run student code, and limiting that interpreter's permissions
using [AppArmor](http://wiki.apparmor.net/index.php/Main_Page) and
[bubblewrap](https://github.com/projectatomic/bubblewrap). This approach will
largely isolate the student code from the system on which it is running, and it
will also limit other resources (memory usage, etc).
<subsection>Installing Necessary Software</subsection>

In order to set things up, you'll need to install both `virtualenv` and
`AppArmor`. On Debian Stretch, this can be done with the following commands:

```
$ sudo pip3 install virtualenv
$ sudo apt install apparmor apparmor-utils apparmor-profiles
```

You'll also need to install `bubblewrap`. The version of `bubblewrap` that is
available in the Debian Stretch repositories does not support some of the
features we want to use, so you should compile from source. You can do so with
the following sequence of commands (on Debian Stretch):

```
$ sudo apt build-dep bubblewrap
$ git clone https://github.com/projectatomic/bubblewrap
$ cd bubblewrap
$ ./autogen.sh
$ make
$ sudo make install
```

This will make an executable called `bwrap`, which our sandbox will use.

On Debian, you will also need to set a kernel parameter to allow unprivileged
users to create new user namespaces:

```
$ sudo sysctl kernel.unprivileged_userns_clone=1
```

You should also set this parameter in `/etc/sysctl.conf` so it persists across
reboots.
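
To confirm that both prerequisites are in place, a short check along the following lines may help. It is only a sketch: `read_flag` is a hypothetical helper, and it assumes the kernel parameter is exposed at the usual procfs location for sysctl keys (the file exists only on kernels that carry this Debian-specific setting).

```python
import shutil
from pathlib import Path

# procfs view of the kernel.unprivileged_userns_clone sysctl
USERNS_FLAG = Path('/proc/sys/kernel/unprivileged_userns_clone')


def read_flag(path):
    """Return the flag's contents as a string, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


print('bwrap found at:', shutil.which('bwrap'))  # None if not on PATH
print('unprivileged_userns_clone =', read_flag(USERNS_FLAG))  # want '1'
```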
<subsection>Virtual Environment</subsection>

Now that we have all of the necessary software, we'll set up a virtual
environment. The sandboxed code will be run in this environment. Pick a
location (one that is readable by the user running the web server) and create a
new virtual environment there with the Python interpreter you want the checkers
to use. In the example below, we'll use the `/usr/bin/python3` interpreter, and
we'll set up the virtual environment in `/home/catsoop/python3_sandbox`.

```
$ virtualenv --always-copy -p /usr/bin/python3 /home/catsoop/python3_sandbox
```

If you want to use a different Python version as the basis for the virtual
environment, you can change the `-p` option.
<subsubsection>Installing Packages to the Sandbox</subsubsection>

If you would like your checkers to be able to use any packages outside the
standard library, you can install them using the `pip` executable within the
virtual environment. For most packages, this is all that is needed. For
example, to make `pillow` available within the sandbox, we could use:

```
$ /home/catsoop/python3_sandbox/bin/pip install pillow
```

However, some packages require special care when installing. For example,
`numpy` normally uses multiple processes when computing its results, but a
desirable feature of the sandbox is that it prevents student code from
launching new processes of any kind. To get around this, it is possible to
compile `numpy` for the sandbox with all optimizations disabled, for example:

```
$ sudo apt build-dep python3-numpy
$ wget https://files.pythonhosted.org/packages/94/b8/09db804ddf3bb7b50767544ec8e559695b152cedd64830040a0f31d6aeda/numpy-1.14.4.zip
$ unzip numpy-1.14.4.zip
$ cd numpy-1.14.4
$ BLAS=None LAPACK=None ATLAS=None /home/catsoop/python3_sandbox/bin/python3 setup.py install
```
<subsection>AppArmor</subsection>

We'll use AppArmor to place some limits on our sandboxed Python interpreter.
Before we can do so, we'll have to configure the Linux kernel to use AppArmor
for security. You can do this by modifying `/etc/default/grub`. Within that
file, you'll need to modify the line starting with `GRUB_CMDLINE_LINUX_DEFAULT`
by adding `apparmor=1 security=apparmor` to the end of the arguments in quotes.
For example, after making this modification, this line appears on my machine
as:

```
GRUB_CMDLINE_LINUX_DEFAULT="quiet apparmor=1 security=apparmor"
```

After making this modification, you'll need to update GRUB and reboot for the
changes to take effect:

```
$ sudo update-grub
$ sudo reboot
```
After the reboot, you'll need to set up an AppArmor profile to limit your
virtual environment's Python interpreter. Create a file
`/etc/apparmor.d/py3sandbox` containing the following, but replacing
`/home/catsoop/python3_sandbox` with your sandbox location (if it is
different), and tuning some of the other parameters if necessary:

```
#include <tunables/global>
/home/catsoop/python3_sandbox/bin/python3.6 {
    /** wrix,
    set rlimit nproc <= 0,
    set rlimit fsize <= 1M,
    set rlimit as <= 500M,
}
```

This file does a couple of things:

* It allows access to the entire filesystem. This might seem dangerous, but we'll use `bwrap` to handle the filesystem sandboxing (though you can modify the entries above to further restrict things).
* It also introduces three resource limits:
    * student code will not be allowed to spawn any new processes
    * student code cannot write more than 1MB of data to files
    * student code will not be allowed to use more than 500MB of memory

All of these parameters are tunable, and other resources can also be limited,
as documented [here](https://linux.die.net/man/2/setrlimit).
Finally, enable the profile with the following command:

```
$ sudo aa-enforce /etc/apparmor.d/py3sandbox
```

You can then test your setup by running the Python interpreter (in our example,
`/home/catsoop/python3_sandbox/bin/python3`) and trying to write more than 1MB
of data to a file:

```py
with open('/tmp/test', 'w') as f:
    f.write('a'*(1024**2+1))
```

This should produce an error, since this interpreter is not allowed to write
that much data to disk.
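
Another way to see whether the profile is being applied is to ask the interpreter for its own resource limits. The sketch below uses the standard `resource` module and maps the three AppArmor rlimit names used in the example profile to the corresponding Python constants; `current_limits` is a hypothetical helper. When run with the confined interpreter, the reported limits would be expected to reflect the values from the profile, whereas an unconfined interpreter will typically report much larger values or no limit at all.

```python
import resource

# AppArmor rlimit names from the example profile, mapped to Python's constants.
LIMITS = {
    'nproc': resource.RLIMIT_NPROC,
    'fsize': resource.RLIMIT_FSIZE,
    'as': resource.RLIMIT_AS,
}


def current_limits():
    """Return {name: (soft, hard)} for the limits named in the profile."""
    return {name: resource.getrlimit(res) for name, res in LIMITS.items()}


for name, (soft, hard) in current_limits().items():
    # a value equal to resource.RLIM_INFINITY means "no limit"
    print(f'{name}: soft={soft} hard={hard}')
```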
<div class="callout callout-info">
<h4>Note</h4>
If you did use AppArmor to place additional restrictions on filesystem access,
and if you later wish to install other Python packages for the sandboxed
interpreter, you will first need to disable the AppArmor protections by
running:
<pre>
$ sudo aa-disable /etc/apparmor.d/py3sandbox
</pre>
Then you can install the packages using the <code>pip</code> executable within the virtual
environment, and re-enable the AppArmor protections afterwards by running:
<pre>
$ sudo aa-enforce /etc/apparmor.d/py3sandbox
</pre>
</div>
<subsection>CAT-SOOP Configuration</subsection>

Now that we have those pieces set up, we'll need to configure CAT-SOOP to use
this new sandbox.

Add the following to your `preload.py` (so that all pages in the course inherit
it), substituting your own values where appropriate:

```python
csq_python_sandbox = 'bwrap'
# the following should match the line in /etc/apparmor.d/py3sandbox exactly
csq_python_interpreter = '/home/catsoop/python3_sandbox/bin/python3.6'
csq_bwrap_extra_ro_binds = [('/home/catsoop/python3_sandbox', '/home/catsoop/python3_sandbox')]
```

The `csq_bwrap_extra_ro_binds` variable tells bubblewrap to mount certain
directories from the base system on the virtual filesystem available to the
student's code in read-only mode. In our case, it is necessary to include the
directory from which our Python executable is available.
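
Since the interpreter path has to match the AppArmor profile exactly, a small check can catch a mismatch such as `python3` versus `python3.6`. This is a sketch only: `profile_confines` is a hypothetical helper, and it recognizes just the simple `/path/to/binary {` form used in the example profile on this page, not every profile syntax AppArmor accepts.

```python
def profile_confines(profile_text, interpreter):
    """Return True if some profile line opens a block for `interpreter`."""
    for line in profile_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == interpreter and parts[-1] == '{':
            return True
    return False


# Value from the preload.py example on this page.
csq_python_interpreter = '/home/catsoop/python3_sandbox/bin/python3.6'

try:
    with open('/etc/apparmor.d/py3sandbox') as f:
        print(profile_confines(f.read(), csq_python_interpreter))
except OSError as exc:
    print('could not read profile:', exc)
```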
And that's it! It is worth running a few tests after implementing this, to make
sure things are working properly. For example, I would usually try to:

* call `os.fork()` and/or use the `subprocess` module to start a child process
* write too much data to a file
* list the files in a directory not included in the sandbox (e.g., someone's home directory)
* use too much memory
* cause an infinite loop

If the system properly stops the code from running in all of the examples above
but works for a correct solution, then you're probably in good shape!
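
Most of the checks listed above can be gathered into one script and submitted as if it were student code. The following is a sketch under several assumptions: the probe names, the `/home` directory, the temporary file path, and the sizes are all illustrative, with the sizes chosen to exceed the 1MB and 500MB limits from the example profile. The infinite loop is left out because it cannot be caught from inside the script and has to be tested on its own.

```python
import os
import subprocess
import sys


def probe_fork():
    """Try to create a child process directly."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately
    os.waitpid(pid, 0)


def probe_subprocess():
    """Try to start a child process via the subprocess module."""
    subprocess.run([sys.executable, '-c', 'pass'], check=True)


def probe_big_write(path='/tmp/sandbox_probe', size=1024**2 + 1):
    """Try to write just over 1MB to a file."""
    with open(path, 'w') as f:
        f.write('a' * size)


def probe_list_dir(path='/home'):
    """Try to list a directory that should not be visible in the sandbox."""
    os.listdir(path)


def probe_memory(size=600 * 1024**2):
    """Try to allocate more than 500MB."""
    bytearray(size)


PROBES = {
    'fork': probe_fork,
    'subprocess': probe_subprocess,
    'big write': probe_big_write,
    'list dir': probe_list_dir,
    'memory': probe_memory,
}


def attempt(probe):
    """Run one probe and report whether the sandbox stopped it."""
    try:
        probe()
    except Exception as exc:
        return f'blocked: {type(exc).__name__}'
    return 'ALLOWED'


def report():
    """Run every probe; in a working sandbox, none should print ALLOWED."""
    for name, probe in PROBES.items():
        print(f'{name}: {attempt(probe)}')
```

Calling `report()` from within the sandbox should print a `blocked` line for every probe. Running it outside the sandbox is not recommended, since the probes will then actually fork, write the file, and allocate the memory.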