How can I mitigate the risks of intentionally allowing users to run code on a server, in a resource-efficient manner? Posted: 10 Oct 2021 10:35 PM PDT To summarize: I want to run multiple servers on a host, for different groups of users, and allow them to add and modify functionality of the servers at runtime with some form of scripting. This is a 100% necessary feature for what I want to do, but there is obviously a lot of potential for this to go horribly wrong; what are some actions/approaches I can take to mitigate the risks involved? It's probably worth mentioning that I am not a system administrator and am most likely unaware of many best practices taken for granted by those that are; my ideas are based on what seems logical to me given what little I know - if any of them seem misguided or if I'm missing some important ideas, please assume I am ignorant of what should be done and let me know. To elaborate: Users will connect to a master server, which will handle authentication and facilitate accessing their scriptable servers of choice. I'm expecting that users will establish a connection to a scriptable server and afterward communicate directly to it for a session; that seems logical from the point of view of an application developer, I'm not sure if it's a bad security practice? All the servers will run linux and be run on a linux host. The master server will have a database, probably PostgreSQL, with some user-related data e.g. for authentication. It will probably also have some data regarding the scriptable servers so it can advertise them and handle connecting users to them. The scriptable servers will need some user-related data as well, but mostly will contain user-created content; each scriptable server will store that data in an SQLite database. Users should not be able to run any code on the master server. On the scriptable servers, I envision them using a language like Tcl or Lua, which can be embedded and allow exposing only part of their functionality to users. Tcl does this via a "safe interpreter"; Lua apparently has some sandboxing capability. I don't expect these language features to completely protect my servers from being compromised. I am considering running all the servers in their own rootless docker containers. I know this is not enough to contain a compromise, but my understanding is that it should help. The next part is where I am especially unsure what to do: Running each server in its own virtual machine with VirtualBox or the like would provide what I imagine would be very good isolation, however it would also consume a lot of resources; I need to avoid that, or else find a linux distro that only consumes something like 5-20 MB of RAM, which seems unlikely to me and would still likely be way more resources than the servers themselves would use, which would be very unappealing. Is there a good answer or alternative to this? Ultimately, while I hope to avoid any system being compromised, I expect it to happen at some point or another; if somebody wants to do that, they're going to figure it out. I'm looking for a resource-efficient way to make it difficult for a single compromised server to lead to others being compromised; even better would be if I could keep the compromised server from being used to wreak havoc across the Internet. I'd also like to have a good way to recover from this; I intend to take backups and send them elsewhere, but I'd be interested in any other suggestions - I could see that being another question, however. |
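Since the question already leans on rootless Docker plus language-level sandboxing, much of the resource-efficiency concern can be handled at the container level rather than with full VMs: per-container memory, CPU and PID caps cost almost nothing. A minimal sketch of launching one scriptable server that way; the image name, port and limit values are placeholders and this is not a vetted hardening profile:

# Run one scriptable server with hard resource caps and a reduced attack surface.
docker run -d \
  --name scriptable-server-1 \
  --memory 128m --memory-swap 128m \
  --cpus 0.5 \
  --pids-limit 100 \
  --read-only --tmpfs /tmp \
  --cap-drop ALL \
  --security-opt no-new-privileges \
  -p 4001:4000 \
  my-scriptable-server:latest

Each server would still need a writable volume for its SQLite file if the root filesystem is made read-only.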
OpenSSH server doesn't allow multiple logins on a LAN when connecting to the public IP Posted: 10 Oct 2021 10:10 PM PDT I have an OpenSSH server set up on Linux Mint, with SSH keys and a ~/.ssh/config file on the Mac where I do most of my work. If I am at home on the same LAN as the server and run ssh my_username@my_public_IP, it works the first time but hangs the second time. However, the same command when not on the LAN works as expected, allowing me to make multiple SSH connections simultaneously. Why does this happen, and how can I mitigate this behavior so I don't need to run a different ssh command depending on where I am? |
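When the Mac is on the same LAN, reaching the public IP depends on the router supporting NAT hairpinning, which many consumer routers handle poorly, so letting ssh pick the right address automatically is a common workaround. A minimal ~/.ssh/config sketch with a hypothetical alias and addresses (myserver, 192.168.1.50, 203.0.113.10; substitute your own), relying on the BSD nc shipped with macOS (its -G connect-timeout flag differs on other netcat variants):

# Use the LAN address when it answers on port 22, otherwise fall back to the public IP.
Match originalhost myserver exec "nc -z -G 1 192.168.1.50 22 2>/dev/null"
    HostName 192.168.1.50

Host myserver
    HostName 203.0.113.10
    User my_username

With this in place, ssh myserver works from either network; the Match block wins on the LAN because ssh uses the first value it obtains for each option.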
Unable to scrape kubelet API from Prometheus Posted: 10 Oct 2021 09:53 PM PDT I am setting up Prometheus to scrape a Kubernetes cluster. I am trying to use "role: node" with kubernetes_sd_config to monitor one of the K8s clusters. I created the certificate ashishcert.pem for the user "ashish", and Prometheus will use this cert to scrape the cluster. The certificate is signed by the cluster CA. Prometheus.yml Now when I look back at my Prometheus, it says "cannot validate certificate x.x.x because it does not contain any IP SANs" result on prometheus side The port number given in the image is for the kubelet, which means it is unable to scrape kubelet metrics for all the nodes in the cluster, even though I have added all the node names and IPs to the SAN of the certificate. I validated my certificate by checking the metrics of the apiserver using my cert and the CA cert with the command below. curl -v https://myclustername:6443/metrics --cacert ca.pem --cert ashishcert.pem --key ashishkey.pem The above command worked successfully; my cert was accepted by the apiserver. However, when I tried to curl the kubelet metrics at https://myclustername:10250/metrics, it gave me an error saying the CA is not trusted, so it looks like the kubelet CA is different from the apiserver CA. result while doing curl My understanding was that my certificate would connect me (Prometheus) to the apiserver, and that all further communication would then be the apiserver's responsibility, i.e. the apiserver would use its own certificate to fetch the metrics from the kubelet. However, judging by the results of the above commands, it looks like my cert is also being authenticated directly against the kubelet. Please confirm whose certificate is used for this internal communication. How can I scrape all the nodes with role: node without ignoring certificate validation? |
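One way to sidestep the kubelet's separate serving certificate (which typically lacks IP SANs) is to scrape the kubelets through the API server's node proxy, so that only the API server certificate, the one the curl test above already validates, is ever checked. A hedged sketch of such a scrape job reusing the file names from the question; the file paths and the external API address are assumptions (inside the cluster you would use kubernetes.default.svc:443 instead), and the RBAC user behind ashishcert.pem needs access to the nodes/proxy resource:

- job_name: 'kubernetes-nodes'
  scheme: https
  tls_config:
    ca_file: /etc/prometheus/ca.pem
    cert_file: /etc/prometheus/ashishcert.pem
    key_file: /etc/prometheus/ashishkey.pem
  kubernetes_sd_configs:
    - role: node
      api_server: https://myclustername:6443
      tls_config:
        ca_file: /etc/prometheus/ca.pem
        cert_file: /etc/prometheus/ashishcert.pem
        key_file: /etc/prometheus/ashishkey.pem
  relabel_configs:
    # Send every scrape to the API server...
    - target_label: __address__
      replacement: myclustername:6443
    # ...and rewrite the path to the per-node kubelet proxy endpoint.
    - source_labels: [__meta_kubernetes_node_name]
      regex: (.+)
      target_label: __metrics_path__
      replacement: /api/v1/nodes/${1}/proxy/metrics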
Does Azure Application Insights/Monitor have a way to check uptime of external REST APIs? Posted: 10 Oct 2021 08:49 PM PDT With Application Insights you can ping websites from different regions, like Pingdom. However, we are looking for a way to call external (not hosted in Azure) REST endpoints, ideally being able to take the output of one call, extract a token, and then use it in the parameters of another. This can be done with Synthetics in New Relic (extremely expensive) or "advanced" checks in Pingdom (which doesn't support the UK as a source), but we would prefer to do it via Azure. We could set up a VM and run curl from a shell script, but that is a poor solution. This is not to be confused with using REST APIs to access Azure Monitor; it's the opposite. |
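Azure's built-in URL ping tests cannot chain requests, but the commonly documented workaround is to run the multi-step check yourself (for example from an Azure Function or any scheduler) and report the result, and the chaining itself is simple. A sketch of the two-step call with placeholder URLs and a hypothetical access_token field, using curl and jq:

#!/bin/sh
# Step 1: call the first endpoint and pull a token out of its JSON response.
TOKEN=$(curl -sf https://api.example.com/auth | jq -r '.access_token') || exit 2
# Step 2: call the second endpoint with that token; fail on any non-2xx status.
curl -sf -H "Authorization: Bearer ${TOKEN}" https://api.example.com/orders || exit 2
echo "check passed"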
How to port forward on an AT&T Router/Modem? Posted: 10 Oct 2021 08:38 PM PDT I am trying to open port 2222 for an OpenSSH server on my Linux machine. I am able to log into it just fine from my local IP address. sudo systemctl status returns: Loaded: loaded (/lib/systemd/system/ssh.service; enabled; vendor preset: enabled) Active: active (running) since Sun 2021-10-10 19:25:19 PDT; 34min ago Docs: man:sshd(8) man:sshd_config(5) Process: 9445 ExecStartPre=/usr/sbin/sshd -t (code=exited, status=0/SUCCESS) Main PID: 9446 (sshd) Tasks: 1 (limit: 19025) Memory: 3.6M CGroup: /system.slice/ssh.service └─9446 sshd: /usr/sbin/sshd -D [listener] 0 of 10-100 startups Oct 10 19:25:19 jacob-desktop systemd[1]: Starting OpenBSD Secure Shell server... Oct 10 19:25:19 jacob-desktop sshd[9446]: Server listening on 0.0.0.0 port 2222. Oct 10 19:25:19 jacob-desktop sshd[9446]: Server listening on :: port 2222. Oct 10 19:25:19 jacob-desktop systemd[1]: Started OpenBSD Secure Shell server. Oct 10 19:49:54 jacob-desktop sshd[9648]: Accepted publickey for jacob from 192.168.1.220 port 53539 ssh2: ED25519 SHA256:9DMi> Oct 10 19:49:54 jacob-desktop sshd[9648]: pam_unix(sshd:session): session opened for user jacob by (uid=0) But according to https://www.portchecktool.com/ my port 2222 is closed. My AT&T port config is as follows: This one doesn't work What is so frustrating is that a port I opened using a different tool within the modem works, as shown below: This one works I don't see why port 22 should be open and work whereas port 2222 does not. Is AT&T's modem bad, or am I making some error I don't see? Thanks to anyone who helps with this, I've been ripping out my hair for months on various server-building attempts. |
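Before blaming the router, it is worth confirming that sshd is reachable on 2222 locally, that the host firewall allows it, and that the test is being run from outside the LAN (a public-IP check from inside only works if the router supports hairpin NAT). A few checks, assuming ufw; adapt for firewalld or raw iptables if that is what the machine uses:

# On the Linux machine: confirm sshd is listening on 2222 and allow it locally.
sudo ss -tlnp | grep 2222
sudo ufw allow 2222/tcp
# From a host OUTSIDE your LAN (e.g. a phone hotspot): test the forward.
nc -vz -w 5 YOUR_PUBLIC_IP 2222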
How do I shrink my EBS Volume without losing my site data Posted: 10 Oct 2021 07:29 PM PDT I would like to reduce my EBS volume size without losing my site. I have already tried taking snapshots and creating another, smaller volume, but all the methods I try involve using commands, which I would prefer to avoid. Can anyone suggest an easier way? |
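For context, EBS volumes can be grown in place but not shrunk, so every shrink path ultimately comes down to creating a smaller volume and copying the data across; there is no purely point-and-click route for the copy step. A rough sketch for a data volume, with example device names and paths (check yours with lsblk; shrinking a root volume additionally involves the bootloader and /etc/fstab):

# Format and mount the NEW, smaller volume after attaching it:
sudo mkfs -t ext4 /dev/xvdf
sudo mkdir -p /mnt/new && sudo mount /dev/xvdf /mnt/new
# Copy the site data, preserving permissions and ownership:
sudo rsync -aHAX --numeric-ids /var/www/ /mnt/new/
# Then unmount, swap the volumes (or update /etc/fstab) and remount.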
clamav - clamd error when set up as a daemon (macOS) Posted: 10 Oct 2021 07:10 PM PDT Issue: Setting up ClamAV as a daemon process on macOS throws some cumbersome errors and warnings during setup; the documentation is good, but not perfect. I ran into some permission issues, file location issues, etc. Things that are working: - the freshclam daemon via a cron job (will post below)
- getting the daemon to load via launchd and show via sudo launchctl list | grep clam
- starting the daemon via launchd*
Things that are not working: - clamd created from the launchd plist does not stay in the list after starting
- clamd starts, but returns the error below
Error: clamdclam.log: ERROR: LOCAL: Socket file /usr/local/etc/clamav/clamd.socket is in use by another process. Setup: CONFIG_DIR="/usr/local/etc" CLAM_HOME_DIR=~/clamav # Make dir for configs in home dir mkdir -p ${CLAM_HOME_DIR} # Create configs clamconf -g freshclam.conf > ${CLAM_HOME_DIR}/freshclam.conf clamconf -g clamd.conf > ${CLAM_HOME_DIR}/clamd.conf clamconf -g clamav-milter.conf > ${CLAM_HOME_DIR}/clamav-milter.conf # Link configs ln -nsf $(pwd)/freshclam.conf /usr/local/etc/clamav/ ln -nsf $(pwd)/clamd.conf /usr/local/etc/clamav/ ln -nsf $(pwd)/clamav-milter.conf /usr/local/etc/clamav/ # Test freshclam is working # create freshclam a log file sudo touch /var/log/freshclam.log sudo chmod 600 /var/log/freshclam.log sudo chown clamav /var/log/freshclam.log # create Clamd Log file sudo touch /var/log/clamdclam.log sudo chmod 600 /var/log/clamdclam.log sudo chown clamav /var/log/clamdclam.log Files: All configs and functional files /usr/local/etc/clamav: ls -l /usr/local/etc/clamav/ total 472256 -rw-r--r-- 1 _clamav admin 293670 Oct 10 17:35 bytecode.cvd lrwxr-xr-x 1 user admin 37 Oct 10 17:14 clamav-milter.conf -> /Users/user/clamav/clamav-milter.conf lrwxr-xr-x 1 root admin 29 Oct 10 20:48 clamd.conf -> /Users/user/clamav/clamd.conf -rwxrwxr-x 1 user admin 26784 Oct 9 16:46 clamd.conf.sample -rw-r--r-- 1 root wheel 5 Oct 10 21:09 clamd.pid srw-rw---- 1 root wheel 0 Oct 10 20:59 clamd.socket lrwxr-xr-x 1 user admin 31 Oct 10 19:25 clamd_run.sh -> /Users/user/clamav/clamd_run.sh -rw-r--r-- 1 _clamav admin 56261254 Oct 10 17:34 daily.cvd lrwxr-xr-x 1 user admin 33 Oct 10 17:14 freshclam.conf -> /Users/user/clamav/freshclam.conf -rwxrwxr-x 1 user admin 7204 Oct 9 16:46 freshclam.conf.sample -rw-r--r-- 1 _clamav _clamav 69 Oct 10 17:34 freshclam.dat -rw-r--r-- 1 _clamav admin 170479789 Oct 10 17:35 main.cvd mac osx plist file /Library/LaunchDaemons/com.clamd.daemon.plist <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd"> <plist version="1.0"> <dict> <key>Label</key> <string>com.clamav.daemon</string> <key>ProgramArguments</key> <array> <string>/usr/local/Cellar/clamav/0.104.0_1/sbin/clamd</string> <string>-c</string> <string>/usr/local/etc/clamav/clamd.conf</string> <string>-l</string> <string>/var/log/clamdclam.log</string> </array> <key>KeepAlive</key> <dict> <key>Crashed</key> <true/> </dict> <key>StandardOutPath</key> <string>/tmp/test.stdout</string> <key>StandardErrorPath</key> <string>/tmp/test.stderr</string> <key>RunAtLoad</key> <true/> <key>LaunchOnlyOnce</key> <true/> </dict> </plist> Currently Testing: - changed file ownership: was
user:wheel -> root:wheel -> root:admin srw-rw---- 1 root wheel 0 Oct 10 20:59 clamd.socket |
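The "Socket file ... is in use by another process" error usually means either that a second clamd instance is already running or that a stale socket file was left behind by a previous run; note also that the plist file is named com.clamd.daemon.plist while its Label is com.clamav.daemon, so launchctl will list the job under the latter name. A quick diagnostic sketch using the paths from the post:

# Is anything actually holding the socket, and how many clamd processes exist?
sudo lsof /usr/local/etc/clamav/clamd.socket
pgrep -fl clamd
# If nothing is using it, clear the stale socket/pid files and reload the job:
sudo rm -f /usr/local/etc/clamav/clamd.socket /usr/local/etc/clamav/clamd.pid
sudo launchctl unload /Library/LaunchDaemons/com.clamd.daemon.plist
sudo launchctl load /Library/LaunchDaemons/com.clamd.daemon.plist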
Best way to do simple new employee laptop setup for Windows 10/11 without Windows Server or cloning Posted: 10 Oct 2021 08:41 PM PDT (Recommended I move here from SuperUser) I ran a few Windows-based labs over a decade ago; for the past few years I've mostly been managing Macs and systems in AWS and GCP. Right now the company is hiring a lot of finance folks who want to work on Windows laptops. The basic setup is: I set up a local admin account so we can always get back into the machine, add a user account for the new employee, and install basic apps like Google Chrome, Office, Zoom, etc., nothing too fancy. We don't have Active Directory running and don't have SCCM at this time. With this context, I was wondering if there's a way to do a simple automated install on new laptops, similar to macOS Migration Assistant, where a new laptop already has a fresh Windows OS from the factory and I just want to set up the user account and install the additional apps. In my research I mostly saw more elaborate enterprise options using SCCM, Windows Server, etc. |
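For the app-install half of this, one lightweight option on a fresh Windows 10/11 machine is a short script around the built-in winget package manager, run once from an elevated PowerShell prompt on each new laptop. The package IDs below are assumptions; verify them first with winget search:

# install-apps.ps1 - minimal new-laptop app install sketch (package IDs unverified)
winget install --silent --accept-package-agreements --accept-source-agreements Google.Chrome
winget install --silent --accept-package-agreements --accept-source-agreements Zoom.Zoom
winget install --silent --accept-package-agreements --accept-source-agreements Microsoft.Office

User accounts and settings are a separate problem; winget only covers the software list.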
LDAP service not running on Windows Server 2019 Posted: 10 Oct 2021 06:42 PM PDT I have two Windows Server 2019 machines, server1 and server2. server1 is the domain controller and has these roles installed: ADDS, ADCS, DNS, FILE STORAGE, IIS. server2 is joined to that domain controller and has these roles installed: ADCS, FILE STORAGE, IIS. I have set up PKI on server1 and everything works fine; I am able to use both the CRL and the OCSP feature for certificate validation. I wanted to make server2 a subordinate CA of server1 (the root CA), installed the corresponding role (ADCS), and am able to distribute user certificates; that works fine. But I am not able to test CRL functionality on server2, as it requires an LDAP bind to server2. As I debugged further, I found that no LDAP server is running on server2: port 389 is listening on server1 but not on server2. So how do I enable the LDAP service on server2? I am not able to test the CRL functionality of the PKI because the CDP URL is an LDAP address. |
Create multiple directories with mode and loop via ansible [SOLVED] Posted: 10 Oct 2021 09:56 PM PDT I'm trying to use loop in a playbook to have Ansible create multiple directories on a server with specific attributes: mode, owner and group. I think I'm close, but I can't get it working. I get this error: Unsupported parameters for (file) module: recursive Supported parameters include: _diff_peek, _original_basename, access_time, access_time_format, attributes, backup, content, delimiter, directory_mode, follow, force, group, mode, modification_time, modification_time_format, owner, path, recurse, regexp, remote_src, selevel, serole, setype, seuser, src, state, unsafe_writes Any advice would be much appreciated. Here is the playbook sample: - name: ansible create directory with_items example file: path: "{{ item.dest }}" mode: "{{item.mode}}" owner: "{{item.owner}}" group: "{{item.group}}" recursive: true state: directory loop: - { dest: '/var/lib/tftpboot/os/uefi/debian11', mode: '0744', owner: 'root', group: 'root' } - { dest: '/var/lib/tftpboot/os/uefi/ubuntu2004D', mode: '0744', owner: 'root', group: 'root'} - { dest: '/var/lib/tftpboot/os/uefi/f34w', mode: '0744', owner: 'root', group: 'root'} - { dest: '/var/lib/tftpboot/os/uefi/f34s', mode: '0744', owner: 'root', group: 'root'} - { dest: '/srv/nfs/isos', mode: '0744', owner: 'root', group: 'rpcuser'} - { dest: '/srv/nfs/pxe/debian11', mode: '0744', owner: 'root', group: 'rpcuser'} - { dest: '/srv/nfs/pxe/ubuntu2004', mode: '0744', owner: 'root', group: 'rpcuser'} - { dest: '/srv/nfs/pxe/f34w', mode: '0744', owner: 'root', group: 'rpcuser'} - { dest: '/srv/nfs/pxe/f34s', mode: '0744', owner: 'root', group: 'rpcuser'} - { dest: '/tmp/debian11', mode: '0744', owner: 'root', group: 'root'} - { dest: '/tmp/f34w', mode: '0744', owner: 'root', group: 'root'} - { dest: '/tmp/ubuntu2004D', mode: '0744', owner: 'root', group: 'root'} |
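As the error's own list of supported parameters shows, the file module's option is recurse, not recursive, and recurse is only needed if you also want the mode/owner applied to existing directory contents; for simply creating the directories it can be dropped entirely. A corrected task sketch with one loop item shown and the rest unchanged:

- name: create directories with owner, group and mode
  file:
    path: "{{ item.dest }}"
    mode: "{{ item.mode }}"
    owner: "{{ item.owner }}"
    group: "{{ item.group }}"
    state: directory
  loop:
    - { dest: '/var/lib/tftpboot/os/uefi/debian11', mode: '0744', owner: 'root', group: 'root' }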
SQUID Transparent Proxy: Error INVALID_URL and ACCESS_DENIED Posted: 10 Oct 2021 10:45 PM PDT I configured a Squid proxy on CentOS 7. I am using Squid version 3.5.20. I also tried Squid 4.10 on Ubuntu 20.04, but I got the same problem, so maybe my ACL is wrong. I configured DST-NAT on the router to intercept HTTP traffic from 192.168.1.0/24 to the Squid proxy at 10.10.10.10:3128. This is the /etc/squid/squid.conf file: acl localnet src 10.0.0.0/8 # RFC1918 possible internal network acl localnet src 172.16.0.0/12 # RFC1918 possible internal network acl localnet src 192.168.0.0/16 # RFC1918 possible internal network acl whitelist_domain dstdomain "/etc/squid/whitelist.acl" http_access allow localnet http_access allow localhost http_access allow whitelist_domain http_access deny all http_port 3128 coredump_dir /var/spool/squid refresh_pattern ^ftp: 1440 20% 10080 refresh_pattern ^gopher: 1440 0% 1440 refresh_pattern -i (/cgi-bin/|\?) 0 0% 0 refresh_pattern . 0 20% 4320 And this is the /etc/squid/whitelist.acl file: linux.or.id lipi.go.id Please help me find the problem. So, regarding the above config, clients should be denied access to all HTTP websites except linux.or.id and lipi.go.id, right? However, when I try to connect, every website shows this error: INVALID URL This is /var/log/squid/access.log 1633885185.900 0 192.168.1.251 TAG_NONE/400 3867 GET / - HIER_NONE/- text/html 1633885185.970 0 192.168.1.251 TCP_IMS_HIT/304 295 GET http://linux:3128/squid-internal-static/icons/SN.png - HIER_NONE/- image/png I then tried changing squid.conf to: http_port 3128 intercept http_port 3129 However, I got ERROR ACCESS DENIED, which means my ACL blocked the access, right? |
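Two things stand out in the quoted configuration. First, DNAT'ed (intercepted) traffic must land on a port declared with the intercept flag, while explicitly configured clients keep using a plain port, so both ports are needed. Second, http_access allow localnet appears before the whitelist rule, and because Squid stops at the first matching rule, everything from 192.168.1.0/24 is allowed and the whitelist never restricts anything. A sketch of the relevant part of squid.conf under those assumptions:

# Only whitelisted destinations are allowed for the local network.
acl localnet src 192.168.0.0/16
acl whitelist_domain dstdomain "/etc/squid/whitelist.acl"
http_access allow localnet whitelist_domain
http_access allow localhost
http_access deny all
# Forward proxy port (for explicitly configured clients):
http_port 3128
# Intercept port: point the router's DST-NAT at 10.10.10.10:3129 instead of 3128.
http_port 3129 intercept

Entries in whitelist.acl are usually written with a leading dot (.linux.or.id) so that subdomains match as well.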
GCP data centre fire safety details Posted: 10 Oct 2021 06:06 PM PDT I've been asked by a customer to provide fire safety details for Google Cloud Platform. They require it for their procurement data security policies. We use the europe_west2 London region to host their services and data. Is there any way to find out the fire safety details of this, or other data centres? I've tried extensively to find it through the GCP console and documentation and I've drawn a blank, and there is no contact available with the basic support plan that we're currently on. many thanks |
RemainAfterExit in Upstart Posted: 10 Oct 2021 07:17 PM PDT Is there an Upstart equivalent to systemd's RemainAfterExit? I have an upstart task that exec's a script that completes quickly when the task is started. However, I would still like that task to report as active so that I can subsequently 'stop' the task and have it execute a cleanup script. In systemd, I would do the following: [Service] Type=oneshot RemainAfterExit=true ExecStart=/usr/local/bin/my_script.sh create %i ExecStop=/usr/local/bin/my_script.sh delete %i How would I do the same thing in Upstart? |
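For reference, the usual Upstart pattern for this is a job with no main exec or script stanza at all: pre-start runs the setup, post-stop runs the cleanup, and with no main process the job simply stays in the start/running state in between, which is effectively RemainAfterExit. A sketch for /etc/init/my-task.conf, with NAME standing in for systemd's %i instance specifier (names are placeholders):

# /etc/init/my-task.conf
description "create/delete wrapper for my_script.sh"
instance $NAME
pre-start exec /usr/local/bin/my_script.sh create $NAME
post-stop exec /usr/local/bin/my_script.sh delete $NAME

It is started and stopped per instance with start my-task NAME=foo and stop my-task NAME=foo.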
How can I rsync an RPM link? Posted: 10 Oct 2021 07:13 PM PDT I want to download all the packages that are in here, but I do not want to download them one by one. How can I do that with rsync? Thanks! |
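If the repository host offers an rsync endpoint, a recursive pull looks like the first line below; if it only serves HTTP, wget's recursive mode is the usual fallback. Both URLs are placeholders for the link in the question:

# rsync, if the mirror exposes an rsync:// module:
rsync -avz --progress rsync://mirror.example.org/repo/path/ ./packages/
# HTTP-only fallback: mirror every .rpm under the directory listing.
wget -r -np -nd -A '*.rpm' https://mirror.example.org/repo/path/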
SIGKILL has no effect on a process running 100% CPU Posted: 10 Oct 2021 10:07 PM PDT I have a weird behaiviour on my pi4 running Ubuntu server 21.04. It's running correctly, but at after a while, a can see a process running 100% CPU from hours, and if I wait longer there are 2, 3 ... other processes running 100% CPU. They seem to be launched by a cron job (from the home automation Jeedom), but this is not my question. The weird thing is I cannot kill them, even root user with kill -9 . The process is running R, but not responding. #ps aux | grep 46149 www-data 46149 99.7 0.0 2040 80 ? R Oct04 633:33 sh -c (ps ax || ps w) | grep -ie "cron_id=7$" | grep -v "grep" #sudo kill -9 46149 #ps aux | grep 46149 www-data 46149 99.7 0.0 2040 80 ? R Oct04 633:36 sh -c (ps ax || ps w) | grep -ie "cron_id=7$" | grep -v "grep" In this example, the blocked process is 'ps', but this is not always the same. If a power off the pi, it restarts normally, but another blocked process will appear after a while. And I need to power off, because 'reboot' will not work. Edit: Using 'ps axjf' to see process tree 1 7317 7317 1799 ? -1 Sl 0 0:56 /usr/bin/containerd-shim-runc-v2 -namespace moby -id bf40089312cdb1d7707096fe6fc46520c7c1a17a70eac305473761976c1f4b7d -address /run/cont 7317 7337 7337 7337 ? -1 Ss 0 1:12 \_ /usr/bin/python2 /usr/bin/supervisord -c /etc/supervisor/supervisord.conf 7337 7391 7391 7337 ? -1 S 0 0:02 | \_ /usr/sbin/cron -f -L4 7391 104917 7391 7337 ? -1 S 0 0:00 | | \_ /usr/sbin/CRON -f -L4 104917 104919 104919 104919 ? -1 Ss 0 0:00 | | | \_ /bin/sh -c /usr/bin/php /var/www/html/core/php/watchdog.php >> /dev/null 104919 104920 104919 104919 ? -1 R 0 1521:41 | | | \_ /bin/sh -c /usr/bin/php /var/www/html/core/php/watchdog.php >> /dev/null 7391 395309 7391 7337 ? -1 S 0 0:00 | | \_ /usr/sbin/CRON -f -L4 395309 395312 395312 395312 ? -1 Ss 33 0:00 | | \_ /bin/sh -c /usr/bin/php /var/www/html/core/php/jeeCron.php >> /dev/null 395312 395313 395312 395312 ? -1 S 33 0:00 | | \_ /usr/bin/php /var/www/html/core/php/jeeCron.php 395313 395341 395312 395312 ? -1 S 33 0:00 | | \_ sh -c (ps ax || ps w) | grep -ie "cron_id=4$" | grep -v "grep" 395341 395344 395312 395312 ? -1 R 33 109:29 | | \_ sh -c (ps ax || ps w) | grep -ie "cron_id=4$" | grep -v "grep" 7337 7392 7392 7337 ? -1 S 1 0:00 | \_ /usr/sbin/atd -f 7337 8613 8613 7337 ? -1 Sl 0 6:16 | \_ /usr/bin/python3 /usr/bin/fail2ban-server -fc /etc/fail2ban/ 7337 11223 10184 10184 ? -1 S 33 0:08 | \_ php /var/www/html/core/class/../php/jeeCron.php cron_id=452778 7337 18465 18465 18465 ? -1 SNs 0 0:08 | \_ /usr/sbin/apache2 -k start 18465 168788 18465 18465 ? -1 SN 33 0:48 | | \_ /usr/sbin/apache2 -k start 18465 354445 18465 18465 ? -1 SN 33 0:27 | | \_ /usr/sbin/apache2 -k start 18465 356077 18465 18465 ? -1 SN 33 0:24 | | \_ /usr/sbin/apache2 -k start 18465 356301 18465 18465 ? -1 SN 33 0:25 | | \_ /usr/sbin/apache2 -k start 18465 362824 18465 18465 ? -1 SN 33 0:16 | | \_ /usr/sbin/apache2 -k start 18465 364208 18465 18465 ? -1 SN 33 0:14 | | \_ /usr/sbin/apache2 -k start 18465 366422 18465 18465 ? -1 SN 33 0:12 | | \_ /usr/sbin/apache2 -k start 18465 366848 18465 18465 ? -1 SN 33 0:12 | | \_ /usr/sbin/apache2 -k start 18465 367416 18465 18465 ? -1 SN 33 0:10 | | \_ /usr/sbin/apache2 -k start 18465 367576 18465 18465 ? -1 SN 33 0:11 | | \_ /usr/sbin/apache2 -k start 18465 405605 18465 18465 ? -1 SN 33 0:03 | | \_ /usr/sbin/apache2 -k start 7337 18824 18465 18465 ? 
-1 SN 33 174:59 | \_ php /var/www/html/core/class/../php/jeeCron.php cron_id=301554 7337 35774 18465 18465 ? -1 SNl 33 0:31 | \_ node /var/www/html/plugins/alexaapi/resources/alexaapi.js http://app_jeedom amazon.fr alexa.amazon.fr OtAkaDFZj3YlSEQg6T1VGk8Jq8 7337 44738 44738 44738 ? -1 SNs 106 0:00 | \_ /usr/bin/dbus-daemon --system 7337 44767 44766 44766 ? -1 SN 107 1:13 | \_ avahi-daemon: running [bf40089312cd.local] 44767 44768 44766 44766 ? -1 SN 107 0:00 | | \_ avahi-daemon: chroot helper 7337 45616 18465 18465 ? -1 SNl 33 4:20 | \_ homebridge 45616 45664 18465 18465 ? -1 SNl 33 2:10 | | \_ homebridge-config-ui-x 7337 46149 46102 46102 ? -1 R 33 1931:04 | \_ sh -c (ps ax || ps w) | grep -ie "cron_id=7$" | grep -v "grep" 7337 407386 18465 18465 ? -1 RN 33 0:00 | \_ php /var/www/html/core/class/../php/jeeListener.php listener_id=2 event_id=379484 value='1310' datetime='2021-10-06 06:36:25' 7317 22607 22607 22607 ? 22607 Ss+ 0 0:00 \_ /bin/bash Edit I tried to kill parent: every level of the process tree has been killed, except the parent and the blocked process (2 processes this time with the same parent). Now I have root 5790 0.0 0.0 0 0 ? Ss Oct09 0:14 \_ [sh] www-data 267740 99.4 0.0 2040 84 ? RN 05:05 1032:49 \_ sh -c ps ax | grep "resources/alexaapi.js" | grep -v "grep" | wc -l HOME=/var/www LOGNAME=www-data PATH=/usr/bin:/bin SHELL=/bi www-data 357120 99.5 0.0 2040 80 ? RN 14:00 501:07 \_ sh -c (ps ax || ps w) | grep -ie "cron_id=469432$" | grep -v "grep" HOME=/var/www LOGNAME=www-data PATH=/usr/bin:/bin SHELL=/bin And with 'ps-ef': root 5790 5760 0 Oct09 ? 00:00:14 [sh] www-data 267740 5790 99 Oct10 ? 1-01:58:16 sh -c ps ax | grep "resources/alexaapi.js" | grep -v "grep" | wc -l www-data 357120 5790 99 Oct10 ? 17:06:33 sh -c (ps ax || ps w) | grep -ie "cron_id=469432$" | grep -v "grep" |
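A process that stays in the R state at 100% CPU and survives SIGKILL usually isn't misbehaving user code: the signal can only be acted on when the task leaves kernel mode, so a task spinning inside the kernel will appear unkillable, which typically points at a kernel or driver problem rather than at the script itself. Before the next forced reboot it is worth capturing what the kernel is doing on the task's behalf; a small sketch using the PID from the question:

# Kernel-side stack and wait channel of the stuck task (needs root):
sudo cat /proc/46149/stack
cat /proc/46149/wchan; echo
grep -E 'State|SigPnd|SigBlk' /proc/46149/status
# Any related kernel messages (hung tasks, oopses, undervoltage, etc.):
dmesg | tail -n 50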
WHfB - Hybrid Certificate Trust - Failed provisioning Posted: 10 Oct 2021 07:03 PM PDT After setting up Windows Hello for Business, in a Hybrid Azure AD joined Certificate Trust Deployment scenario, i ended up with the following events in my test client machine after a failed provisioning. I reviewed my setup, but i must be missing something. Any help would be highly appreciated. ############################## Microsoft-Windows-AAD/Operational TimeCreated : 13/05/2020 11:57:04 Id : 1082 Message : Key error: DecodingProtectedCredentialKeyFatalFailure Error description: AADSTS9002313: Invalid request. Request is malformed or invalid. Trace ID: 834deec1-21d8-48c2-bae5-7f795e312f00 Correlation ID: 88bc2dda-ba2a-42dc-a9da-7b9f362f7d7a Timestamp: 2020-05-13 22:57:04Z CorrelationID: 88bc2dda-ba2a-42dc-a9da-7b9f362f7d7a TimeCreated : 13/05/2020 11:57:03 Id : 1118 Message : Enterprise STS Logon failure. Status: 0xC000006D Correlation ID: FE6DBC4F-69BB-426B-933B-0BADB38A1361 TimeCreated : 13/05/2020 11:57:03 Id : 1081 Message : OAuth response error: invalid_grant Error description: MSIS9683: Received invalid OAuth JWT Bearer request. Transport key for the device is invalid. It must be a RSA public key blob or TPM storage key blob. CorrelationID: TimeCreated : 13/05/2020 11:57:03 Id : 1025 Message : Http request status: 400. Method: POST Endpoint Uri: https://adfs.domain.com/adfs/oauth2/token/ Correlation ID: FE6DBC4F-69BB-426B-933B-0BADB38A1361 TimeCreated : 13/05/2020 11:56:01 Id : 1082 Message : Key error: DecodingProtectedCredentialKeyFatalFailure Error description: AADSTS9002313: Invalid request. Request is malformed or invalid. Trace ID: 4a2197fa-c85f-4ea0-af79-1a830e1d2d00 Correlation ID: f6141ebb-116c-4701-9118-80124017b6d1 Timestamp: 2020-05-13 22:56:02Z CorrelationID: f6141ebb-116c-4701-9118-80124017b6d1 TimeCreated : 13/05/2020 11:56:01 Id : 1118 Message : Enterprise STS Logon failure. Status: 0xC000006D Correlation ID: E5C246DD-9FFF-4E07-92A5-61389B08C64A TimeCreated : 13/05/2020 11:56:01 Id : 1081 Message : OAuth response error: invalid_grant Error description: MSIS9683: Received invalid OAuth JWT Bearer request. Transport key for the device is invalid. It must be a RSA public key blob or TPM storage key blob. CorrelationID: TimeCreated : 13/05/2020 11:56:01 Id : 1025 Message : Http request status: 400. Method: POST Endpoint Uri: https://adfs.domain.com/adfs/oauth2/token/ Correlation ID: E5C246DD-9FFF-4E07-92A5-61389B08C64A ####################################### Microsoft-Windows-HelloForBusiness/Operational TimeCreated : 13/05/2020 11:57:00 Id : 5520 Message : Device unlock policy is not configured on this device. TimeCreated : 13/05/2020 11:56:03 Id : 7054 Message : Windows Hello for Business prerequisites check failed. Error: 0x1 TimeCreated : 13/05/2020 11:56:03 Id : 8205 Message : Windows Hello for Business successfully located a usable sign-on certificate template. TimeCreated : 13/05/2020 11:56:03 Id : 8206 Message : Windows Hello for Business successfully located a certificate registration authority. TimeCreated : 13/05/2020 11:56:03 Id : 7211 Message : The Secondary Account Primary Refresh Token prerequisite check failed. TimeCreated : 13/05/2020 11:56:03 Id : 8202 Message : The device meets Windows Hello for Business hardware requirements. TimeCreated : 13/05/2020 11:56:03 Id : 8204 Message : Windows Hello for Business post-logon provisioning is enabled. TimeCreated : 13/05/2020 11:56:03 Id : 8203 Message : Windows Hello for Business is enabled. 
TimeCreated : 13/05/2020 11:56:03 Id : 5204 Message : Windows Hello for Business certificate enrollment configurations: Certificate Enrollment Method: RA Certificate Required for On-Premise Auth: true TimeCreated : 13/05/2020 11:56:03 Id : 8200 Message : The device registration prerequisite check completed successfully. TimeCreated : 13/05/2020 11:56:03 Id : 8201 Message : The Primary Account Primary Refresh Token prerequisite check completed successfully. TimeCreated : 13/05/2020 11:56:03 Id : 8210 Message : Windows Hello for Business successfully completed the remote desktop prerequisite check. TimeCreated : 13/05/2020 11:56:03 Id : 3054 Message : Windows Hello for Business prerequisites check started. TimeCreated : 13/05/2020 11:56:00 Id : 8025 Message : The Microsoft Passport Container service started successfully. TimeCreated : 13/05/2020 11:56:00 Id : 8025 Message : The Microsoft Passport service started successfully. TimeCreated : 13/05/2020 11:55:07 Id : 5520 Message : Device unlock policy is not configured on this device. ####################################### Microsoft-Windows-User Device Registration/Admin TimeCreated : 13/05/2020 11:56:59 Id : 331 Message : Automatic device join pre-check tasks completed. Debug output:\r\n preCheckResult: DoNotJoin deviceKeysHealthy: YES isJoined: YES isDcAvailable: YES isSystem: YES keyProvider: Microsoft Platform Crypto Provider keyContainer: c9bc09fb-e9bd-4de7-b06a-f8798e6f377c dsrInstance: AzureDrs elapsedSeconds: 0 resultCode: 0x1 TimeCreated : 13/05/2020 11:56:59 Id : 335 Message : Automatic device join pre-check tasks completed. The device is already joined. TimeCreated : 13/05/2020 11:56:03 Id : 360 Message : Windows Hello for Business provisioning will not be launched. Device is AAD joined ( AADJ or DJ++ ): Yes User has logged on with AAD credentials: Yes Windows Hello for Business policy is enabled: Yes Windows Hello for Business post-logon provisioning is enabled: Yes Local computer meets Windows hello for business hardware requirements: Yes User is not connected to the machine via Remote Desktop: Yes User certificate for on premise auth policy is enabled: Yes Machine is governed by enrollment authority policy. See https://go.microsoft.com/fwlink/?linkid=832647 for more details. TimeCreated : 13/05/2020 11:56:03 Id : 362 Message : Windows Hello for Business provisioning will not be launched. Device is AAD joined ( AADJ or DJ++ ): Yes User has logged on with AAD credentials: Yes Windows Hello for Business policy is enabled: Yes Windows Hello for Business post-logon provisioning is enabled: Yes Local computer meets Windows hello for business hardware requirements: Yes User is not connected to the machine via Remote Desktop: Yes User certificate for on premise auth policy is enabled: Yes Enterprise user logon certificate enrollment endpoint is ready: Yes Enterprise user logon certificate template is : Yes User has successfully authenticated to the enterprise STS: No Certificate enrollment method: enrollment authority See https://go.microsoft.com/fwlink/?linkid=832647 for more details. TimeCreated : 13/05/2020 11:55:09 Id : 331 Message : Automatic device join pre-check tasks completed. 
Debug output:\r\n preCheckResult: DoNotJoin deviceKeysHealthy: YES isJoined: YES isDcAvailable: YES isSystem: YES keyProvider: Microsoft Platform Crypto Provider keyContainer: c9bc09fb-e9bd-4de7-b06a-f8798e6f377c dsrInstance: AzureDrs elapsedSeconds: 1 resultCode: 0x1 TimeCreated : 13/05/2020 11:55:09 Id : 335 Message : Automatic device join pre-check tasks completed. The device is already joined. TimeCreated : 13/05/2020 11:55:05 Id : 369 Message : The Workstation Service logged a device registration message. Message: AutoJoinSvc/WJComputeWorkplaceJoinTaskState: Machine is already joined to Azure AD. TimeCreated : 13/05/2020 11:55:05 Id : 369 Message : The Workstation Service logged a device registration message. Message: AutoJoinSvc/WJSetScheduledTaskState: Running task "\Microsoft\Windows\Workplace Join\Automatic-Device-Join". TimeCreated : 13/05/2020 11:55:05 Id : 369 Message : The Workstation Service logged a device registration message. Message: AutoJoinSvc/WJComputeWorkplaceJoinTaskState: Global policy found with value 1. |
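The repeated "Transport key for the device is invalid" response from AD FS points at the device-registration side (the TPM-bound transport key presented to AD FS) rather than at the WHfB policy, which passes its prerequisite checks above. A commonly suggested remediation, offered here only as such, is to re-register the test device so a fresh key pair is generated; from an elevated prompt on the client:

:: Inspect the current registration and key provider first:
dsregcmd /status
:: Unregister, reboot, then let the Automatic-Device-Join scheduled task
:: (which runs as SYSTEM) re-join the device before retrying provisioning:
dsregcmd /leave

If the device keeps coming back with an unusable transport key, clearing the TPM and making sure the AD FS servers are fully patched are the usual next steps.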
Having an issue enabling internode TLS support in rabbitmq / erlang Posted: 10 Oct 2021 10:18 PM PDT We are running rabbit v3.8.3-1.el7, erlang v23.3.3.el7, kernel 3.10.0-1062.12.1.el7.x86_64, release Centos 7.7 I have three nodes that I would like in disc mode, cdvlhbqr23[1-3] However I'm running into an issue after attempting to enable TLS on erlang. [ cdvlhbqr231:rabbitmq ] 10.128.3.231 :: root -> rabbitmqctl status Error: unable to perform an operation on node 'rabbit@cdvlhbqr231'. Please see diagnostics information and suggestions below. Most common reasons for this are: * Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues) * CLI tool fails to authenticate with the server (e.g. due to CLI tool's Erlang cookie not matching that of the server) * Target node is not running In addition to the diagnostics info below: * See the CLI, clustering and networking guides on https://rabbitmq.com/documentation.html to learn more * Consult server logs on node rabbit@cdvlhbqr231 * If target node is configured to use long node names, don't forget to use --longnames with CLI tools DIAGNOSTICS =========== attempted to contact: [rabbit@cdvlhbqr231] rabbit@cdvlhbqr231: * connected to epmd (port 4369) on cdvlhbqr231 * epmd reports node 'rabbit' uses port 25672 for inter-node and CLI tool traffic * TCP connection succeeded but Erlang distribution failed * suggestion: check if the Erlang cookie identical for all server nodes and CLI tools * suggestion: check if all server nodes and CLI tools use consistent hostnames when addressing each other * suggestion: check if inter-node connections may be configured to use TLS. If so, all nodes and CLI tools must do that * suggestion: see the CLI, clustering and networking guides on https://rabbitmq.com/documentation.html to learn more Current node details: * node name: 'rabbitmqcli-23412-rabbit@cdvlhbqr231' * effective user's home directory: /var/lib/rabbitmq/ * Erlang cookie hash: MudCW7tn3FA5sTmC1FlR0g== I've double checked the cookie file and it's identical across all nodes. All of the hostnames are correct and consistent across the nodes. So I assume this has to be directly the result of ssl / tls Here's what the node config looks like: [ cdvlhbqr231:rabbitmq ] 10.128.3.231 :: root -> cat /etc/rabbitmq/rabbitmq.config [ {rabbit, [ {vm_memory_high_watermark, 0.4}, {vm_memory_high_watermark_paging_ratio, 0.5}, {memory_alarms, true}, {disk_free_limit, 41686528}, {cluster_partition_handling, autoheal}, {tcp_listen_options, [binary, {packet, raw}, {reuseaddr, true}, {backlog, 128}, {nodelay, true}, {exit_on_close, false}, {keepalive, true} ] }, {cluster_nodes, {['rabbit@cdvlhbqr231', 'rabbit@cdvlhbqr232', 'rabbit@cdvlhbqr233'], disc}}, {loopback_users, []}, {tcp_listeners, [{"0.0.0.0",5672}]}, {ssl_listeners, [{"0.0.0.0",5671}]}, {ssl_options, [ {cacertfile,"/etc/pki/tls/certs/ca-bundle.crt"}, {certfile,"/etc/rabbitmq/ssl/cert.pem"}, {keyfile,"/etc/rabbitmq/ssl/key.pem"}, {verify,verify_peer}, {versions, ['tlsv1.2']}, {fail_if_no_peer_cert,false} ]} ] }, {rabbitmq_management, [{ listener, [ {port, 15672}, {ip, "0.0.0.0"}, {ssl, true}, {ssl_opts, [ {cacertfile,"/etc/pki/tls/certs/ca-bundle.crt"}, {certfile,"/etc/rabbitmq/ssl/cert.pem"}, {keyfile,"/etc/rabbitmq/ssl/key.pem"}, {verify,verify_peer}, {versions, ['tlsv1.2']} ]} ]} ] } ]. The private key is generated on the host and signed by an intermediate CA whose pub key is in the systems extracted cert bundle. 
We generate an "/etc/rabbitmq/ssl/allfile.pem" which is a bundle of the servers private key and signed cert. The ssl environment for erlang is defined as the following: [ cdvlhbqr231:rabbitmq ] 10.128.3.231 :: root -> cat rabbitmq-env.conf # Obtaining of an Erlang ssl library path export HOME=/var/lib/rabbitmq/ ERL_SSL_PATH=/usr/lib64/erlang/lib/ssl-9.6.2/ebin # Add SSL-related environment vars for rabbitmq-server and rabbitmqctl SERVER_ADDITIONAL_ERL_ARGS="-pa $ERL_SSL_PATH \ -proto_dist inet_tls \ -ssl_dist_opt server_certfile '/etc/rabbitmq/ssl/allfile.pem' \ -ssl_dist_opt server_secure_renegotiate true client_secure_renegotiate true" # CLI CTL_ERL_ARGS="-pa $ERL_SSL_PATH \ -proto_dist inet_tls \ -ssl_dist_opt server_certfile /etc/rabbitmq/ssl/allfile.pem \ -ssl_dist_opt server_secure_renegotiate true client_secure_renegotiate true" I'm not entirely clear what's causing the issue. I thought I had followed the documentation to a T. Can anyone help me review this and see if there is anything obvious I'm missing, or any suggestions on how to trace down this problem? |
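One easy-to-miss detail worth ruling out in this setup is the hard-coded ERL_SSL_PATH: the ebin directory name embeds the ssl application's version (ssl-9.6.2 here), and that version changes whenever Erlang is upgraded, so after moving to Erlang 23 the path may simply no longer exist, which would leave the TLS distribution setup incomplete. A quick check, with the second command asking the installed Erlang directly where its ssl ebin directory is:

# Does the directory referenced in rabbitmq-env.conf still exist?
ls -d /usr/lib64/erlang/lib/ssl-*/ebin
# Ask Erlang itself for the correct path to use as ERL_SSL_PATH:
erl -noinput -eval 'io:format("~s~n", [code:lib_dir(ssl, ebin)]), halt().'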
systemctl limits - solr complaining Posted: 10 Oct 2021 10:08 PM PDT I'm building a solr server (on Ubuntu 18.04, using the repo solr-common and solr-jetty). On startup, solr was reporting that nfile and nproc (1024, 6721 resp) was set too low. I ran systemctl edit solr and created an override as follows: [Service] LimitNOFILE=65000 LimitNPROC=65000 I then restarted the service - solr still reporting the same issue. I added /etc/security/limits.d/solr containing: solr hard nofile 65535 solr soft nofile 65535 solr hard nproc 65535 solr soft nproc 65535 It is still reporting the same issue after restarting the service: # systemctl status solr ● solr.service - LSB: Controls Apache Solr as a Service Loaded: loaded (/etc/init.d/solr; generated) Drop-In: /etc/systemd/system/solr.service.d └─override.conf Active: active (exited) since Mon 2020-03-30 14:55:49 BST; 6s ago Docs: man:systemd-sysv-generator(8) Process: 6848 ExecStop=/etc/init.d/solr stop (code=exited, status=0/SUCCESS) Process: 6973 ExecStart=/etc/init.d/solr start (code=exited, status=0/SUCCESS) Mar 30 14:55:43 dev-a01-si-solr.bip solr[6973]: *** [WARN] *** Your open file limit is currently 1024. Mar 30 14:55:43 dev-a01-si-solr.bip solr[6973]: It should be set to 65000 to avoid operational disruption. Mar 30 14:55:43 dev-a01-si-solr.bip solr[6973]: If you no longer wish to see this warning, set SOLR_ULIMIT_CHECKS to false in your profile or solr.in.sh Mar 30 14:55:43 dev-a01-si-solr.bip solr[6973]: *** [WARN] *** Your Max Processes Limit is currently 6721. Mar 30 14:55:43 dev-a01-si-solr.bip solr[6973]: It should be set to 65000 to avoid operational disruption. Mar 30 14:55:43 dev-a01-si-solr.bip solr[6973]: If you no longer wish to see this warning, set SOLR_ULIMIT_CHECKS to false in your profile or solr.in.sh Mar 30 14:55:49 dev-a01-si-solr.bip solr[6973]: [194B blob data] Mar 30 14:55:49 dev-a01-si-solr.bip solr[6973]: Started Solr server on port 8983 (pid=7045). Happy searching! Mar 30 14:55:49 dev-a01-si-solr.bip solr[6973]: [14B blob data] Mar 30 14:55:49 dev-a01-si-solr.bip systemd[1]: Started LSB: Controls Apache Solr as a Service. What am I doing wrong here? update After amending /etc/systemd/system.conf to contain... DefaultLimitNOFILE=65000 DefaultLimitNPROC=65000 Solr is no longer complaining about the file limit but still complaining about the process limit. WTF Pottering? Drop-In: /etc/systemd/system/solr.service.d └─override.conf Active: active (exited) since Mon 2020-03-30 15:21:59 BST; 14s ago Docs: man:systemd-sysv-generator(8) Process: 1141 ExecStart=/etc/init.d/solr start (code=exited, status=0/SUCCESS) Mar 30 15:21:51 dev-a01-si-solr.bip solr[1141]: If you no longer wish to see this warning, set SOLR_ULIMIT_CHECKS to false in your profile or solr.in.sh Mar 30 15:21:51 dev-a01-si-solr.bip solr[1141]: *** [WARN] *** Your Max Processes Limit is currently 6721. Mar 30 15:21:51 dev-a01-si-solr.bip solr[1141]: It should be set to 65000 to avoid operational disruption. Mar 30 15:21:51 dev-a01-si-solr.bip solr[1141]: If you no longer wish to see this warning, set SOLR_ULIMIT_CHECKS to false in your profile or solr.in.sh Mar 30 15:21:51 dev-a01-si-solr.bip solr[1141]: Warning: Available entropy is low. As a result, use of the UUIDField, SSL, or any other features that require Mar 30 15:21:51 dev-a01-si-solr.bip solr[1141]: RNG might not work properly. To check for the amount of available entropy, use 'cat /proc/sys/kernel/random/entropy_avail'. 
Mar 30 15:21:59 dev-a01-si-solr.bip solr[1141]: [230B blob data] Mar 30 15:21:59 dev-a01-si-solr.bip solr[1141]: Started Solr server on port 8983 (pid=1459). Happy searching! Mar 30 15:21:59 dev-a01-si-solr.bip solr[1141]: [14B blob data] Mar 30 15:21:59 dev-a01-si-solr.bip systemd[1]: Started LSB: Controls Apache Solr as a Service. Amending user.conf to match did not help. Update 2 Well, this just keeps getting better and better. The disappearance of the nfile warning came after a reboot of the host. When I subsequently run systemctl restart solr I get this: Mar 30 15:39:21 dev-a01-si-solr.bip solr[2503]: *** [WARN] *** Your open file limit is currently 1024. Mar 30 15:39:21 dev-a01-si-solr.bip solr[2503]: It should be set to 65000 to avoid operational disruption. FFS! Now, where did I put that Centos 5 CD? Update 3 It turns out that this was no longer the packaged solr. Unbenknownst to me, someone had problems getting the original build to work and found a tutorial on the internet on how to install from tarball. So I now have a system with half tarball/half repo solr which we can't patch / upgrade. |
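For anyone hitting the same symptom, it helps to compare what systemd thinks the unit's limits are with what the running Solr process actually received; if the two disagree, the process was almost certainly started outside the unit, as turned out to be the case here with the tarball install. A quick sketch:

# Limits the running Solr JVM actually got:
cat /proc/$(pgrep -f solr | head -n 1)/limits | grep -Ei 'open files|processes'
# Limits systemd believes the solr unit should apply:
systemctl show solr | grep -E 'LimitNOFILE|LimitNPROC'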
How to fix broken packages on Ubuntu Posted: 10 Oct 2021 08:29 PM PDT I have a Ubuntu 18.10 server, and recently tried to update git . I keep getting errors that a number of packages are not properly installed. Errors were encountered while processing: libpaper1:amd64 libpaper-utils unattended-upgrades libgs9:amd64 ghostscript Then I ran dpkg --configure -a and see the same errors. I want to be careful and not hose my system, but how can I fix these errors? ~ $ sudo apt list --upgradable Listing... Done ~ $ sudo apt-get check Reading package lists... Done Building dependency tree Reading state information... Done ~ $ sudo apt-get check Reading package lists... Done Building dependency tree Reading state information... Done ~ $ sudo apt-get upgrade Reading package lists... Done Building dependency tree Reading state information... Done Calculating upgrade... Done 0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded. 5 not fully installed or removed. After this operation, 0 B of additional disk space will be used. Do you want to continue? [Y/n] y Setting up libpaper1:amd64 (1.1.24+nmu5ubuntu1) ... dpkg: error processing package libpaper1:amd64 (--configure): installed libpaper1:amd64 package post-installation script subprocess returned error exit status 10 dpkg: dependency problems prevent configuration of libpaper-utils: libpaper-utils depends on libpaper1; however: Package libpaper1:amd64 is not configured yet. dpkg: error processing package libpaper-utils (--configure): dependency problems - leaving unconfigured Setting up unattended-upgrades (1.5ubuntu3.18.10.4) ... dpkg: error processing package unattended-upgrades (--configure): installed unattended-upgrades package post-installation script subprocess returned error exit status 10 dpkg: dependency problems prevent configuration of libgs9:amd64: libgs9:amd64 depends on libpaper1; however: Package libpaper1:amd64 is not configured yet. dpkg: error processing package libgs9:amd64 (--configure): dependency problems - leaving unconfigured dpkg: dependency problems prevent configuration of ghostscript: ghostscript depends on libgs9 (= 9.26~dfsg+0-0ubuntu0.18.10.9); however: Package libgs9:amd64 is not configured yet. dpkg: error processing package ghostscript (--configure): dependency problems - leaving unconfigured Processing triggers for libc-bin (2.28-0ubuntu1) ... Errors were encountered while processing: libpaper1:amd64 libpaper-utils unattended-upgrades libgs9:amd64 ghostscript E: Sub-process /usr/bin/dpkg returned an error code (1) EDIT In response to @Stefan Skoglund's question: ~ $ sudo apt-cache policy libpaper1 libpaper1: Installed: 1.1.24+nmu5ubuntu1 Candidate: 1.1.24+nmu5ubuntu1 Version table: *** 1.1.24+nmu5ubuntu1 500 500 http://mirror.hetzner.de/ubuntu/packages cosmic/main amd64 Packages 500 http://de.archive.ubuntu.com/ubuntu cosmic/main amd64 Packages 100 /var/lib/dpkg/status ~ $ sudo dpkg-reconfigure -plow libpaper1 /usr/sbin/dpkg-reconfigure: libpaper1 is broken or not fully installed EDIT 2 Throwing caution to the wind, I closed my eyes, crossed by fingers and tried this: sudo apt-get --purge remove libpaper1:amd64 libpaper-utils unattended-upgrades libgs9:amd64 ghostscript sudo apt-get clean sudo apt-get update && sudo apt-get upgrade sudo apt autoremove It magically worked. Bounty is still available to someone who could explain what happened here and what the best practice / troubleshooting hints would be. |
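When apt only reports that a post-installation script returned status 10, the most informative next step is to look at the maintainer script itself (the code is whatever the script exited with, not an apt code); tracing it usually shows exactly which command fails. The purge-and-reinstall in Edit 2 most likely worked because --purge removed the stale configuration state the postinst kept tripping over, letting it start clean on reinstall. A troubleshooting sketch, assuming the postinst is a plain shell script as most are:

# Re-run just the failing configure step with maintainer-script debugging:
sudo dpkg -D2 --configure libpaper1:amd64
# Inspect and, if needed, trace the script by hand:
less /var/lib/dpkg/info/libpaper1:amd64.postinst
sudo sh -x /var/lib/dpkg/info/libpaper1:amd64.postinst configure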
MS SQL Port Forwarding with IP Tables Posted: 10 Oct 2021 08:01 PM PDT I have 2 remote servers, one Linux box and one Windows Box. The Windows box is running MS SQL Server. It's behind a firewall, and I can only access it from my Linux box (I can netcat to the Windows box on port 1433 so that's working ok). I'd like to use the linux box as a proxy so I can connect to the MS-SQL Server from my desktop. I've tried setting up IP tables on the linux box with the following config, but my desktop still won't connect. #!/bin/sh echo 1 > /proc/sys/net/ipv4/ip_forward iptables -F iptables -t nat -F iptables -X iptables -t nat -A PREROUTING -p tcp --dport 1433 -j DNAT --to-destination xxx.xxx.xxx.xxx:1433 Any help would be greatly appreciated! |
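A DNAT rule by itself usually isn't enough for this to work end-to-end: the forwarded packets also have to be accepted in the FORWARD chain, and the source address needs to be rewritten on the way out so the SQL Server's replies come back through the Linux box instead of being sent directly to the desktop and dropped. A sketch of the missing pieces, with xxx.xxx.xxx.xxx standing for the Windows server's address as in the question:

# Let the proxied traffic through the filter table:
iptables -A FORWARD -p tcp -d xxx.xxx.xxx.xxx --dport 1433 -j ACCEPT
iptables -A FORWARD -p tcp -s xxx.xxx.xxx.xxx --sport 1433 -j ACCEPT
# Masquerade so replies return via the Linux box:
iptables -t nat -A POSTROUTING -p tcp -d xxx.xxx.xxx.xxx --dport 1433 -j MASQUERADE

If kernel forwarding is not an option, a simpler userspace alternative is an SSH local port forward from the desktop (ssh -L 1433:xxx.xxx.xxx.xxx:1433 user@linux-box).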
NRPE: Unable to read output - Permissions issues Posted: 10 Oct 2021 06:03 PM PDT I am trying to be as clear as possible: my brain is going to explode, like those exploding kittens. Both machines run CentOS 7: [root@192.168.10.2]# cat /proc/version Linux version 3.10.0-693.11.6.el7.x86_64 (builder@kbuilder.dev.centos.org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC) ) #1 SMP Thu Jan 4 01:06:37 UTC 2018 And the latest NRPE from EPEL: [root@192.168.10.1]# ./check_nrpe -H 192.168.10.2 NRPE v3.2.0 I am trying to restart a service from the Nagios server, so I can set an event handler. It all started with a lot of scripts, but I have now narrowed the problem down to this: [root@192.168.10.1]# ./check_nrpe -H 192.168.10.2 -c restart NRPE: Unable to read output [root@192.168.10.1]# ./check_nrpe -H 192.168.10.2 -c status (... correct service status output ...) Loaded: loaded (/usr/lib/systemd/system/cachefilesd.service (... correct service status output ...) So I can query service status, but cannot start or restart services. [root@192.168.10.2]# cat /etc/nagios/nrpe.conf: [...] nrpe_user=nrpe nrpe_group=nrpe allowed_hosts=127.0.0.1,192.168.10.1 command[status]=/lib64/nagios/plugins/status.sh command[restart]=/lib64/nagios/plugins/restart.sh [...] [root@192.168.10.2]# cat /lib64/nagios/plugins/status.sh #!/bin/bash sudo systemctl status cachefilesd exit 0 and [root@192.168.10.2]# cat /lib64/nagios/plugins/restart.sh #!/bin/bash sudo systemctl restart cachefilesd exit 0 sudoers: [root@192.168.10.2]# cat /etc/sudoers # Defaults specification Defaults: nrpe !requiretty Defaults: nagios !requiretty nagios ALL = NOPASSWD: /sbin/service,/usr/bin/systemctl,/usr/sbin/service nrpe ALL = NOPASSWD: /sbin/service,/usr/bin/systemctl,/usr/sbin/service If I type: [root@192.168.10.2]# sudo -u nrpe -H ./restart-cachefilesd.sh all is fine. I enabled debug in NRPE, and I get: nrpe[5431]: Host address is in allowed_hosts nrpe[5431]: Host 192.168.10.1 is asking for command 'restart' to be run... nrpe[5431]: Running command: /lib64/nagios/plugins/restart.sh nrpe[5432]: WARNING: my_system() seteuid(0): Operation not permitted nrpe[5431]: Command completed with return code 0 and output: nrpe[5431]: Return Code: 3, Output: NRPE: Unable to read output nrpe[5431]: Connection from 192.168.10.1 closed. I tried to strace it, but the output is too much for me. |
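The tell-tale line is my_system() seteuid(0): Operation not permitted: sudo itself is configured correctly (it works interactively as the nrpe user), but on CentOS 7 the NRPE daemon runs confined by SELinux, which is typically what blocks the privilege change here, so only the restart command fails. Worth confirming before changing anything else; a sketch to run on the NRPE host 192.168.10.2, where setenforce 0 is only a temporary test, not a fix (afterwards, either adjust the SELinux policy for NRPE or trigger restarts another way):

# Is SELinux enforcing, and is it logging denials for nrpe?
getenforce
sudo ausearch -m avc -ts recent | grep -i nrpe
# Temporary test: switch to permissive mode, retry from the Nagios server
# (./check_nrpe -H 192.168.10.2 -c restart), then re-enable enforcement:
sudo setenforce 0
sudo setenforce 1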
LAPS Find-AdmPwdExtendedRights : Object not found Posted: 10 Oct 2021 09:02 PM PDT I am setting up LAPS at the moment and want to use the standard "Computers" Organisational Unit. I am working through the setup guide but I keep getting this error: PS C:\Users\Administrator.DOMAIN> Find-AdmPwdExtendedRights -OrgUnit "Comp uters" | Format-Table Find-AdmPwdExtendedRights : Object not found At line:1 char:1 + Find-AdmPwdExtendedRights -OrgUnit "Computers" | Format-Table + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + CategoryInfo : NotSpecified: (:) [Find-AdmPwdExtendedRights], N otFoundException + FullyQualifiedErrorId : AdmPwd.PSTypes.NotFoundException,AdmPwd.PS.FindE xtendedRights I get similar error message about Object not found when I try using the Set-AdmPwdComputerSelfPermission commandlet etc. |
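A likely cause of the Object not found error is that the default Computers location in Active Directory is a container (CN=Computers,DC=...), not an organisational unit, and the AdmPwd cmdlets are written to operate on OUs or on names they can resolve to one. Passing the full distinguished name at least removes the name-resolution ambiguity, but the commonly recommended fix is to move the machines into a real OU and point LAPS at that. A sketch with a placeholder domain:

# Use a distinguished name instead of the short name (adjust the DC parts):
Find-AdmPwdExtendedRights -OrgUnit "CN=Computers,DC=example,DC=com" | Format-Table
# The same applies to the permission cmdlets, e.g. against a dedicated OU:
Set-AdmPwdComputerSelfPermission -OrgUnit "OU=Workstations,DC=example,DC=com"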
Certutil -revoke Posted: 10 Oct 2021 07:03 PM PDT I am trying to use certutil to manage my CA. Is there a way to use Certutil -revoke "RequestID=?" I only see it for the SerialNumber of the certificate, which is not very convenient. Envy |
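certutil -revoke only takes a serial number, but the serial for a given request can be looked up from the CA database first and then passed in; a two-step sketch where RequestID 37 and the reason code are just examples:

:: Look up the serial number belonging to a request ID:
certutil -view -restrict "RequestID=37" -out "SerialNumber"
:: Revoke using that serial (optional numeric reason code, e.g. 1 = Key Compromise):
certutil -revoke <SerialNumberFromAbove> 1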
Can OpenSSL be used to debug an SSL connection to a MySQL server? Posted: 10 Oct 2021 09:25 PM PDT I want my webserver to speak to the MySQL database server over an SSL connection. The Webserver runs CentOS5, the Database Server runs FreeBSD. The certificates are provided by a intermediate CA DigiCert. MySQL should be using ssl, according to my.cnf : # The MySQL server [mysqld] port = 3306 socket = /tmp/mysql.sock ssl ssl-capath = /opt/mysql/pki/CA ssl-cert = /opt/mysql/pki/server-cert.pem ssl-key = /opt/mysql/pki/server-key.pem When I start MySQL, the daemon starts without errors. This suggests that the certificate files are all readable. But when I try to connect from the webserver to the database server, I get an error: [root@webserver ~]# mysql -h mysql.example.org -u user -p ERROR 2026 (HY000): SSL connection error And if I try to debug further with openssl: [root@webserver ~]# openssl s_client -connect mysql.example.org:3306 0>/dev/null CONNECTED(00000003) 15706:error:140770FC:SSL routines:SSL23_GET_SERVER_HELLO:unknown protocol:s23_clnt.c:588: Is this a valid way to test the SSL connection to a MySQL database server? The SSL23_GET_SERVER_HELLO:unknown protocol message is strange since this typically what you would see if you were speaking SSL on a port intended for non-SSL traffic. This same openssl command seems to work fine with LDAP & HTTP servers: $ openssl s_client -connect ldap.example.org:636 0>/dev/null CONNECTED(00000003) depth=2 /C=US/O=The Go Daddy Group, Inc./OU=Go Daddy Class 2 Certification Authority ... $ openssl s_client -connect www.example.org:443 0>/dev/null CONNECTED(00000003) depth=0 /DC=org/DC=example/OU=Services/CN=www.example.org |
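The behaviour seen here is expected rather than a server fault: MySQL does not start TLS from the first byte; the client and server exchange an initial plaintext handshake and then upgrade, much like STARTTLS in SMTP, so a plain openssl s_client probe reports an unknown protocol. Two ways to test the TLS layer properly, with the caveat that -starttls mysql only exists in OpenSSL 1.1.1 and later (the CentOS 5 openssl shown above is too old for it) and that paths are placeholders:

# Requires OpenSSL 1.1.1+: negotiate MySQL's in-protocol TLS upgrade.
openssl s_client -starttls mysql -connect mysql.example.org:3306 </dev/null
# Or let the mysql client report whether TLS was actually negotiated:
mysql -h mysql.example.org -u user -p --ssl-ca=/path/to/ca.pem -e "SHOW STATUS LIKE 'Ssl_cipher'"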
How to install APC on a vagrant box running Ubuntu with PHP? Posted: 10 Oct 2021 09:02 PM PDT In my "Vagrant" file I have this line: chef.add_recipe("php::module_apc") But it gives me this error: [2013-01-11T22:14:53+00:00] INFO: Processing package[php-apc] action install (php::module_apc line 34) ================================================================================ Error executing action `install` on resource 'package[php-apc]' ================================================================================ Chef::Exceptions::Exec ---------------------- apt-get -q -y install php-apc=3.1.7-1 returned 100, expected 0 Resource Declaration: --------------------- # In /tmp/vagrant-chef-1/chef-solo-1/cookbooks/php/recipes/module_apc.rb 33: when "debian" 34: package "php-apc" do 35: action :install 36: end 37: end Compiled Resource: ------------------ # Declared in /tmp/vagrant-chef-1/chef-solo-1/cookbooks/php/recipes/module_apc.rb:34:in `from_file' package("php-apc") do retry_delay 2 retries 0 recipe_name "module_apc" action [:install] cookbook_name :php package_name "php-apc" end [2013-01-11T22:14:53+00:00] ERROR: Running exception handlers [2013-01-11T22:14:53+00:00] ERROR: Exception handlers complete [2013-01-11T22:14:53+00:00] FATAL: Stacktrace dumped to /tmp/vagrant-chef-1/chef-stacktrace.out [2013-01-11T22:14:53+00:00] FATAL: Chef::Exceptions::Exec: package[php-apc] (php::module_apc line 34) had an error: Chef::Exceptions::Exec: apt-get -q -y install php-apc=3.1.7-1 returned 100, expected 0 Chef never successfully completed! Any errors should be visible in the output above. Please fix your recipes so that they properly complete. I'm also running this before: chef.add_recipe("apt") But it's no help either. Any ideas how to fix this? Thanks a lot! Btw, I'm using all cookbooks from OpsCode: https://github.com/opscode-cookbooks/ |
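Exit code 100 from apt-get in this context usually just means the pinned version (php-apc=3.1.7-1) isn't available in the box's current package index, most often because apt-get update has not run inside the VM before the php recipe executes; the apt cookbook's default recipe normally takes care of that, so it is worth confirming it really ran first. A quick manual check from vagrant ssh, sketched:

# What does the box's index actually offer for php-apc?
apt-cache policy php-apc
# Refresh the index and retry the exact install Chef attempted:
sudo apt-get update
sudo apt-get -q -y install php-apc=3.1.7-1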
Network behavior of slow Windows shared storage Posted: 10 Oct 2021 06:03 PM PDT I have inherited three Windows XP file servers (their sole purpose is their SMB share) in an office with about 50 users. The workload is only office usage: they use the servers to store and share Access databases and XLS files among themselves, and they work on the files over the network share. It is almost instantaneous to copy a 700 Kb XLS file from one of the servers to a workstation, but it takes over a minute to load it from a remote share with Excel. The same file loads in a few seconds from a local disk. I don't know what makes access to the file so slow over the network; I suspect it is some quirk of Windows remote file access (maybe authentication?), and I hope it is possible to change some simple flag on the servers to speed things up to a sane speed. I have taken screenshots of the network usage while loading the aforementioned XLS file; can you recognize this pattern and possibly give me some clue about what the problem is? In the first image, there are two runs of Excel loading the remote file, both taking more than one minute to complete. The top and bottom graphs show the same thing, but I only found later the task manager option to discriminate between upload (red) and download (yellow), so I took 2 different screenshots (concatenated below). Both runs took more than one minute, possibly more than 2 minutes. In the second image there are the 3rd and 4th runs. This time they ran considerably faster than the first one, but still too slow for bearable use. Both took more than 1 minute, but in the 4th run it occurred to me to measure the time properly, and I found it to take 1 minute and 42 seconds. That was the fastest of them. This time I only took one screenshot, of the discriminated version. What I noticed in all runs is the initial peak, about 8 seconds after I start the run; then the network usage drops very low; then, a few seconds later, there is another peak, the biggest concentrated activity; then a long period of almost no activity, until finally Excel shows the file. There is still another peak that begins when the file is shown and lasts for a few seconds. The offset between the start and end of the run and the activity in the graph seems to be caused by a delay in the task manager showing the data; I don't know when the file is actually downloaded. I also can't explain why the green graph shows a small amount of activity between peaks while the red/yellow graph shows none. But the most intriguing part of all is the minute-long pause between the second and third peak, during which I have no idea what is happening, and which surely could be much faster. Can someone experienced in Windows networks provide an expert guess at what the problem is with this setup (apart from the decade-old operating system)? |
LVS / IPVS difference in ActiveConn since upgrading Posted: 10 Oct 2021 10:08 PM PDT I've recently migrated from an old version of LVS / ldirectord (Ultra Monkey) to a new Debian install with ldirectord. Now the amount of Active Connections is usually higher than the amount of Inactive Connections, it used to be the other way around. Basically on the old load balancer the connections looked something like: -> RemoteAddress:Port Forward Weight ActiveConn InActConn -> 10.84.32.21:0 Masq 1 12 252 -> 10.84.32.22:0 Masq 1 18 368 However since migrating it to the new load balancer it looks more like: -> RemoteAddress:Port Forward Weight ActiveConn InActConn -> 10.84.32.21:0 Masq 1 313 141 -> 10.84.32.22:0 Masq 1 276 183 Old load balancer: - Debian 3.1
- ipvsadm 1.24
- ldirectord 1.2.3
New load balancer: - Debian 6.0.5
- ipvsadm 1.25
- ldirectord 1.0.3 (I guess the versioning system changed)
Is it because the old load balancer was running a kernel from 2005, and ldirectord from 2004, and things have simply changed in the past 7 - 8 years? Did I miss some sysctl settings that I should be enforcing for it to behave in the same way? Everything appears to be working fine but can anyone see an issue with this behaviour? Thanks in advance! Additional info: I'm using LVS in masquerading mode, the real servers have the load balancer as their gateway. The real servers are running Apache, which hasn't changed during the upgrade. The boxes themselves show roughly the same amount of Inactive Connections shown in ipvsadm. |
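For what it's worth, the ActiveConn column counts connections LVS believes are in the ESTABLISHED state while InActConn counts everything else (SYN_RECV, FIN_WAIT, TIME_WAIT and so on), so the ratio is quite sensitive to the connection-state timeouts and to how the newer kernel tracks masqueraded connections; comparing the timeouts on both generations is a quick sanity check. A sketch:

# Show the tcp / tcpfin / udp timeouts IPVS is using on each box:
ipvsadm -L --timeout
# They can be aligned explicitly if they differ, e.g. (example values only):
# ipvsadm --set 900 120 300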
Replace a Cisco VPN IPsec concentrator with an Ubuntu box Posted: 10 Oct 2021 08:01 PM PDT Is it possible to replace a Cisco VPN IPsec concentrator with Ubuntu and, for instance, strongSwan? 1) Does strongSwan implement the same protocols that Cisco uses? 2) Can we retrieve keys from the Cisco concentrator and import them into the Ubuntu box, and if not, can we generate new keys that suit the equipment at the sites? 3) Are there any performance concerns to think of? |
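strongSwan speaks standard IKEv1/IKEv2 IPsec, which is what Cisco gear uses for site-to-site tunnels, so interoperability is mostly a matter of matching proposals; pre-shared keys and certificates generally have to be re-issued rather than extracted from the concentrator. As a concrete starting point, a site-to-site IKEv1/PSK tunnel towards a Cisco peer in classic ipsec.conf style might look like the sketch below, where every address, subnet and proposal string is a placeholder to be matched against the Cisco side's configuration:

conn to-cisco
    keyexchange=ikev1
    authby=secret
    left=%defaultroute
    leftsubnet=10.1.0.0/16
    right=203.0.113.1
    rightsubnet=10.2.0.0/16
    ike=aes256-sha1-modp1536!
    esp=aes256-sha1!
    auto=start

The matching PSK goes in /etc/ipsec.secrets (e.g. 203.0.113.1 : PSK "replace-me").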