Sunday, June 13, 2021

Recent Questions - Server Fault

IIS 10 ARR FARM Only Hits Second Server

Posted: 13 Jun 2021 10:39 PM PDT

I'm trying to learn IIS web farms on Server 2016 with IIS 10. I'm able to configure the farm setup, but my ARR farm only sends hits to the second server, every time.

Here are my configuration details. Main server: Windows Server 2016 Standard - 192.168.2.15 - IIS 10 - the website name is servistest, and it contains only one page, index.asp:

<!DOCTYPE html>
<head>
  <meta name="description" content="Webpage description goes here" />
  <title>Web Server 001</title>
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <meta name="author" content="">
</head>
<body>
<%
  Response.Write "<font color='red' size='35px'><b><center>"+FormatDateTime(date,format)+" "+FormatDateTime(time,format)+"<br>WEBSERVER 001</font></b></center>"
%>
</body>
</html>

Second server: Windows Server 2016 Standard - 192.168.2.16 - IIS 10 - the website name is servistest, and it contains the same ASP page as index.asp:

<!DOCTYPE html>
<head>
  <meta name="description" content="Webpage description goes here" />
  <title>Web Server 002</title>
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <meta name="author" content="">
</head>
<body>
<%
  Response.Write "<font color='BLUE' size='35px'><b><center>"+FormatDateTime(date,format)+" "+FormatDateTime(time,format)+"<br>WEBSERVER 002</font></b></center>"
%>
</body>
</html>

Third server: Windows Server 2016 Standard - 192.168.2.17 - IIS 10 - the website name is servistest, and it contains the same ASP page as index.asp:

<!DOCTYPE html>
<head>
  <meta name="description" content="Webpage description goes here" />
  <title>Web Server 003</title>
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <meta name="author" content="">
</head>
<body>
<%
  Response.Write "<font color='GREEN' size='35px'><b><center>"+FormatDateTime(date,format)+" "+FormatDateTime(time,format)+"<br>WEBSERVER 003</font></b></center>"
%>
</body>
</html>

Here are the main server settings (screenshots omitted: farm settings, URL Rewrite settings, load balance settings, browser output).

After these settings, when I call 192.168.2.15, it hits 192.168.2.16/index.asp and only ever shows that page (browser output screenshot omitted); it never shows the pages from the other two servers.

I refreshed the page with Shift+F5 multiple times and cleared the browser's and the servers' caches; no matter what I do, it only shows the page on Web Server 002 (192.168.2.16) and never hits the main server (192.168.2.15) or the third server (192.168.2.17).

In almost all the how-to documents on the web, domains are used instead of LAN IP addresses; is that what I am doing wrong? I'm working on a local network, so should I edit the hosts files of the servers and clients to work with domains? Does ARR require at least 3 servers (a main server for the farm configuration plus 2 servers to balance) to work properly?
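For reference, a hedged sketch of how a farm with these members typically appears in applicationHost.config on the ARR server (the element names follow the ARR schema; adjust the farm name and algorithm to your setup):

<webFarms>
    <webFarm name="servistest" enabled="true">
        <server address="192.168.2.16" enabled="true" />
        <server address="192.168.2.17" enabled="true" />
        <applicationRequestRouting>
            <loadBalancing algorithm="WeightedRoundRobin" />
        </applicationRequestRouting>
    </webFarm>
</webFarms>

As far as I know, ARR does not require three servers; two members are enough to see alternation under round-robin, and the ARR host itself usually acts as the proxy rather than a farm member. Client affinity (sticky sessions), if enabled, would also pin one client to one server.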

Nginx: how to disable per-IP rate limiting

Posted: 13 Jun 2021 09:15 PM PDT

I have an API which connects through the private IP of the EC2 server and executes a sequence of callbacks. I want to disable per-IP rate limiting in this scenario. I have tried this method from the Nginx documentation.

This did not solve the rate-limit issue. Access log:

192.168.192.51 - - [14/Jun/2021:00:09:55 +0530] "POST /project/api/v1/vendor/callback HTTP/1.1" 429 8576 "-" "Java/1.8.0_151" "-" "192.168.13.173" sn="192.168.13.173" rt=0.009 ua="unix:/var/run/php/php7.4-fpm.sock" us="429" ut="0.008" ul="8591" cs=-  

Nginx conf file

user www-data;
worker_processes auto;
pid /run/nginx.pid;
include /etc/nginx/modules-enabled/*.conf;

events {
    worker_connections 1024;
    # multi_accept on;
}

http {

    geo $limit {
        default 1;
        192.168.192.51 0;
    }

    map $limit $limit_key {
        0 "";
        1 $binary_remote_addr;
    }

    limit_req_zone $limit_key zone=req_zone:10m rate=100r/s;

    ##
    # Basic Settings
    ##

    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;
    # server_tokens off;

    # server_names_hash_bucket_size 64;
    # server_name_in_redirect off;

    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    ##
    # SSL Settings
    ##

    ssl_protocols TLSv1 TLSv1.1 TLSv1.2 TLSv1.3; # Dropping SSLv3, ref: POODLE
    ssl_prefer_server_ciphers on;

    ##
    # Logging Settings
    ##

    error_log /var/log/nginx/error.log warn;
    log_format main_ext '$remote_addr - $remote_user [$time_local] "$request" '
                        '$status $body_bytes_sent "$http_referer" '
                        '"$http_user_agent" "$http_x_forwarded_for" '
                        '"$host" sn="$server_name" '
                        'rt=$request_time '
                        'ua="$upstream_addr" us="$upstream_status" '
                        'ut="$upstream_response_time" ul="$upstream_response_length" '
                        'cs=$upstream_cache_status';
    access_log /var/log/nginx/access.log main_ext;

    ##
    # Gzip Settings
    ##

    gzip on;

    # gzip_vary on;
    # gzip_proxied any;
    # gzip_comp_level 6;
    # gzip_buffers 16 8k;
    # gzip_http_version 1.1;
    # gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;

    ##
    # Virtual Host Configs
    ##

    include /etc/nginx/conf.d/*.conf;
    include /etc/nginx/sites-enabled/*;

    fastcgi_buffers 8 16k;
    fastcgi_buffer_size 32k;
    fastcgi_connect_timeout 90;
    fastcgi_send_timeout 90;
    fastcgi_read_timeout 90;
}

Server Block

server {
    listen 80;
    listen 81;

    root /data/www;

    index index.html index.htm index.php;

    server_name 192.168.13.173;

    location / {
        try_files $uri $uri/ /index.php$is_args$args;
    }

    location /project {
        alias /data/www/project/public;
        try_files $uri $uri/ @project;

        location ~ \.php$ {
            include snippets/fastcgi-php.conf;
            fastcgi_param SCRIPT_FILENAME $request_filename;
            fastcgi_pass unix:/var/run/php/php7.4-fpm.sock;
        }
    }

    location @project {
        rewrite /project/(.*)$ /project/index.php?/$1 last;
    }

    location ~ \.php$ {
        include snippets/fastcgi-php.conf;
        fastcgi_pass unix:/var/run/php/php7.4-fpm.sock;
    }

    location ~ /\.ht {
        deny all;
    }
}
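For comparison, a minimal sketch of the whitelisting pattern from the nginx rate-limiting documentation. The limit_req_zone directive only defines the zone; it takes effect only where a limit_req directive references it (the location and backend below are illustrative):

geo $limit {
    default 1;
    192.168.192.51 0;        # whitelisted source address
}

map $limit $limit_key {
    0 "";                    # empty key: the request is not rate limited
    1 $binary_remote_addr;   # everyone else is keyed by client address
}

limit_req_zone $limit_key zone=req_zone:10m rate=100r/s;

server {
    location /project/ {
        limit_req zone=req_zone burst=20 nodelay;   # zone applied here
        proxy_pass http://127.0.0.1:8080;           # placeholder backend
    }
}

One detail from the access log above worth noting: with this log format, us="429" is $upstream_status, i.e. the 429 was returned by the PHP-FPM upstream, which nginx's own limit_req would not produce.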

How to minimize chances of one's domain being falsely blacklisted (uribl)

Posted: 13 Jun 2021 08:29 PM PDT

My side job is to administer my wife's company domain. It's only used as a domain name for Google mail and tools (Slides, Docs, etc.).

Although she used her domain for email for many years without any issues, we apparently made the mistake of leaving the web part as a parked website. Last week, a Fortune 500 company, our main customer, changed their spam filter, and her emails are getting blocked. Email is required to submit business proposals.

http://multirbl.valli.org/lookup/ lists every list as green for our domain, except uribl. uribl shows us on the multi and black lists.
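As an aside, a hedged sketch of checking such a listing directly with dig: URIBL publishes a combined lookup zone, and an answer in 127.0.0.x indicates a listing (the exact value encodes the sub-list); example.com is a placeholder:

dig +short example.com.multi.uribl.com A
# an answer such as 127.0.0.2 indicates a black-list hit;
# no answer (NXDOMAIN) means the domain is not listed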

I put in a delisting request, and the very fine folks at http://uribl.com answered with:

Status: Rejected

Reason: what kind of business has a parked web page? - will expire

No spam has ever been sent from this domain. It doesn't have any webserver to be compromised, no file transfer. It only ever sent Google email from her account, to a handful of customers, and was not compromised.

My question is: what would a decent sysadmin do in such a situation? I tried putting up a real website (it's up now) and asked again to be delisted, but I have a feeling that the kind of person who would refuse the first delisting with a pedantic, rhetorical question will not be moved.

Is the ability to deliver real business email from one's legitimate domain really in the hands of some random person managing a minor spam blacklist? (I know it sounds like a rant; I'm trying to keep my emotions in check, but this is probably a major risk to half a year's salary.)

Below is a redacted snapshot of multirbl.valli.org showing the details I have. (Screenshot omitted.)

AWS Route 53 Failover DNS with Healthcheck not updating IP (even though health check shows failure)

Posted: 13 Jun 2021 10:35 PM PDT

Our goal is to have a Healthcheck continuously evaluate the health of an endpoint. When it becomes unhealthy, we want the DNS to failover to a different IP address. We have set this up but we now realized that it actually doesn't work (i.e. when the Healthcheck goes red, no failover happens). Here is our current configuration:

A Record

  • Record name: www.mydomain.com
  • Record type: A
  • TTL: 30 seconds
  • Routing policy: Failover
  • Failover record type: Primary
  • Healthcheck: www
  • Record ID: www-1
  • Value:

A Record

  • Record name: www.mydomain.com
  • Record type: A
  • TTL: 30 seconds
  • Routing policy: Failover
  • Failover record type: Secondary
  • Healthcheck: www
  • Record ID: www-1
  • Value:

In addition, we have a health check.

OK - so we recently had an issue, where the healthcheck turned red. We got notified via SNS as expected. However, when doing an NSLookup of www.mydomain.com it was still returning the value for the Primary. We fixed the issue within less than 5 minutes.

Given the TTL and so on configured above, shouldn't we have seen the NSLookup update to show the Secondary? Is it possible it would take longer to failover? If so, why?

Is there an error of some kind in the configuration above? If so any guidance would be greatly appreciated.
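For comparison, a hedged sketch of the same failover pair as it would be created with the AWS CLI; the zone ID, IPs, and health check ID are placeholders. One detail worth checking against the listing above: Route 53 requires each record in a failover pair to have its own unique SetIdentifier (the "Record ID" in the console):

aws route53 change-resource-record-sets \
  --hosted-zone-id Z0EXAMPLE \
  --change-batch file://failover.json

# failover.json
{
  "Changes": [
    { "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "www.mydomain.com", "Type": "A",
        "SetIdentifier": "www-primary", "Failover": "PRIMARY",
        "TTL": 30,
        "ResourceRecords": [{ "Value": "192.0.2.10" }],
        "HealthCheckId": "11111111-2222-3333-4444-555555555555" } },
    { "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "www.mydomain.com", "Type": "A",
        "SetIdentifier": "www-secondary", "Failover": "SECONDARY",
        "TTL": 30,
        "ResourceRecords": [{ "Value": "192.0.2.20" }] } }
  ]
}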

How reliable is the "host" in an incoming HTTPS request?

Posted: 13 Jun 2021 05:37 PM PDT

I'm trying to understand what level of confidence I can have when my API, which lives at api.foo.com, receives a POST request from a page that has foo.com specified as the Host value in its header.

Specifically: is this something that can be faked (maybe even easily?), or is it difficult (or impossible?) for someone to send something to api.foo.com from some entirely different location and spoof the header so that the host appears to be foo.com?

If it's not difficult or impossible, what's the industry-standard mechanism for verifying that a request is coming from a trusted place?
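To make the threat model concrete: every request header is set by the client, so any of them can be forged with one command. A sketch (the URL and payload are placeholders):

# Any HTTP client controls all of its request headers,
# including Host, Origin, and Referer:
curl -X POST "https://api.foo.com/some/endpoint" \
     -H "Origin: https://foo.com" \
     -H "Referer: https://foo.com/page" \
     -d '{"example": true}'

This is why such headers only gate browser-enforced behavior like CORS; server-side trust is normally established with credentials instead, such as API keys, signed tokens, or mutual TLS.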

Tuning Linux router and server for better performance / solving single TCP connection slow speed

Posted: 13 Jun 2021 04:55 PM PDT

I have the simplest and most common network architecture.

A web server sits behind a router on the local network. The router does iptables DNAT, which achieves port forwarding to the web server.

Therefore, I'm able to download a file from Server 1 to my computer over the internet.

(Network diagram omitted.)

My questions

  1. What is the proper kernel tuning to ensure that the router is using most of its potential (for around 2000 connections and the highest throughput)? I have an issue at the point marked orange in the diagram.

  2. Do kernel parameters look fine on Server 1?

  3. Can you explain why I get just 3mbps from Server 1 while CPU and RAM are not overloaded? Can you see other possible issues apart from the Linux kernel, CPU, and RAM? Could you list possible issues to explore: 1gbps network interfaces, ports, etc.? Is 2x1.5GHz ARM too slow for routing? The iptables version?

OS and resources

Computer - macOS, 8 x86 CPU cores, 16G/32G of free RAM

Router - Linux DD-WRT, 2 ARM CPU cores, 270M/512M of free RAM

Server 1 - Linux Ubuntu 18.04, 4 x86 CPU cores, 240M/32G of free RAM (500M swapped to SSD)

Server 2 - Linux Raspbian, 1 ARM CPU core, 95M/512M of free RAM

MTU

Everywhere 1500

TXQUEUELEN

Everywhere 1000

Protocols

UDP speeds are fine

TCP speed is affected, any port

Iptables version

Router - 1.3.7

Server 1 - 1.8.4

Server 2 - 1.6.0

Linux versions

Router - 4.9.207

Server 1 - 5.4.0-67-generic

Server 2 - 4.14.79+

Theoretical link speeds

From my computer to router - 30mbps / 3.75 MB/s

From router to web server 1 - 1gbps

From router to web server 2 - 1gbps

Download speeds from web server (file is hosted in RAM)

TEST 1: Server 2 -> Router = 800mbps

TEST 2: Server 2 -> Computer = 30mbps

TEST 3: Server 1 -> Router = 800mbps

TEST 4: Server 1 -> Computer using 15 connections = 15mbps

TEST 5: Server 1 -> Computer = 3mbps (the issue!)

CPU usage is around a few percent on every device. The CPU load average is 0.0x on all devices except Server 1, which has a load average of 4.6. Server 1 also handles around 500-1000 connections for other things outside of these tests, but at around 1mbps, so it shouldn't affect test throughput dramatically (unless these connections are somehow making things worse indirectly).

Even though its load is higher, TEST 3 performed very well, so it's still hard to blame Server 1.

There are no issues in dmesg on any device.

My thoughts

The issue appears only when DNAT'ing on the router, and only with Server 1, which has a high number of other connections (but these connections are almost idle, so they shouldn't affect things badly?).

Most interesting test to describe in final thoughts

When I do a multi-thread web download (TEST 4), Server 1 performs much better, so it's capable of reaching higher download speeds. But why can't one connection reach the same speed as multiple ones?

Parameters that I explored

Can you see something that is not well optimised for a Linux router?

net.core.wmem_max - maximum tcp socket send buffer memory size (in bytes). Increase TCP read/write buffers to enable scaling to a larger window size. Larger windows increase the amount of data to be transferred before an acknowledgement (ACK) is required. This reduces overall latencies and results in increased throughput.

This setting is typically set to a very conservative value of 262,144 bytes. It is recommended that this value be set as large as the kernel allows. The value used here was 4,136,960 bytes. However, 4.x kernels accept values over 16MB.
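As a worked example of the window arithmetic behind this recommendation (the round-trip time is an assumed figure): a single TCP connection can carry at most one window per round trip, so to fill a 30mbps link at 100ms RTT it needs roughly 30,000,000 / 8 x 0.1 = 375 KB of window; conversely, an effective window near 37 KB caps a single connection at about 3mbps, while each additional parallel connection brings its own window.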

Router - 180224

Server 1 - 212992

Server 2 - 163840

Somewhere else used - 83886080

net.core.wmem_default

Router - 180224

Server 1 - 212992

Server 2 - 163840

Somewhere else used - 83886080

net.core.rmem_max - maximum tcp socket receive buffer memory size (in bytes)

Router - 180224

Server 1 - 212992

Server 2 - 163840

Somewhere else used - 335544320

net.core.rmem_default

Router - 180224

Server 1 - 212992

Server 2 - 163840

Somewhere else used - 335544320

net.ipv4.tcp_rmem - Contains three values that represent the minimum, default and maximum size of the TCP socket receive buffer. The recommendation is to use the maximum value of 16M bytes or higher (kernel level dependent) especially for 10 Gigabit adapters.

Router - 4096 87380 3776288

Server 1 - 4096 131072 6291456

Server 2 - 4096 87380 3515840

Somewhere else used - 4096 87380 4136960 (IBM)

net.ipv4.tcp_wmem - Similar to the net.ipv4.tcp_rmem this parameter consists of 3 values, a minimum, default, and maximum. The recommendation is to use the maximum value of 16M bytes or higher (kernel level dependent) especially for 10 Gigabit adapters.

Router - 4096 16384 3776288

Server 1 - 4096 16384 4194304

Server 2 - 4096 16384 3515840

Somewhere else used - 4096 87380 4136960 (IBM)

net.ipv4.tcp_tw_reuse - In high traffic environments, sockets are created and destroyed at very high rates. This parameter, when set, allows no longer needed and about to be destroyed sockets to be used for new connections. When enabled, this parameter can bypass the allocation and initialization overhead normally associated with socket creation saving CPU cycles, system load and time.

The default value is 0 (off). The recommended value is 1 (on).

Router - 0

Server 1 - 2

Server 2 - 0

Somewhere else used - 1

net.ipv4.tcp_max_tw_buckets - Specifies the maximum number of sockets in the "time-wait" state allowed to exist at any time. If the maximum value is exceeded, sockets in the "time-wait" state are immediately destroyed and a warning is displayed. This setting exists to thwart certain types of Denial of Service attacks. Care should be exercised before lowering this value. When changed, its value should be increased, especially when more memory has been added to the system or when the network demands are high and environment is less exposed to external threats.

Router - 2048

Server 1 - 131072

Server 2 - 2048

Somewhere else used - 65536, 262144 (IBM), 45000 (IBM)

net.ipv4.tcp_fin_timeout

Router - 60

Server 1 - 60

Server 2 - 60

Somewhere else used - 15

net.ipv4.tcp_max_syn_backlog

Router - 128

Server 1 - 2048

Server 2 - 128

Somewhere else used - 65536

net.ipv4.ip_local_port_range - range of ports used for outgoing TCP connections (useful to change it if you have a lot of outgoing connections from host)

Router - 32768 60999

Server 1 - 32768 60999

Server 2 - 32768 60999

Somewhere else used - 1024 65535

net.core.netdev_max_backlog - number of slots in the receiver's ring buffer for arriving packets (kernel put packets in this queue if the CPU is not available to process them, for example by application)

Router - 120

Server 1 - 1000

Server 2 - 1000

Somewhere else used - 100000, 1000 (IBM), 25000 (IBM)

net.ipv4.neigh.default.gc_thresh1

Router - 1

Server 1 - 128

Server 2 - 128

Somewhere else used - 128

net.ipv4.neigh.default.gc_thresh2

Router - 512

Server 1 - 512

Server 2 - 512

Somewhere else used - 512

net.ipv4.neigh.default.gc_thresh3

Router - 1024

Server 1 - 1024

Server 2 - 1024

Somewhere else used - 1024

net.core.somaxconn - maximum listen queue size for sockets (useful and often overlooked setting for loadbalancers, webservers and application servers (like unicorn, php-fpm). If all server processes/threads are busy, then incoming client connections are put in "backlog" waiting for being served). Full backlog causes client connections to be immediately rejected, causing client error.

Router - 128

Server 1 - 4096

Server 2 - 128

net.ipv4.tcp_mem - TCP buffer memory usage thresholds for autotuning, in memory pages (1 page = 4kb)

Router - 5529 7375 11058

Server 1 - 381144 508193 762288

Server 2 - 5148 6866 10296

net.nf_conntrack_max - maximum number of connections

Router - 32768

Server 1 - 262144

Server 2 - no information

net.netfilter.nf_conntrack_max - maximum number of tracked connections? If this is the correct parameter, then 1560 is not enough

Router - 1560

Server 1 - 262144

Server 2 - no information

/proc/sys/net/ipv4/tcp_congestion_control - Network congestion in data networking [...] is the reduced quality of service that occurs when a network node is carrying more data than it can handle. Typical effects include queueing delay, packet loss or the blocking of new connections. Networks use congestion control and congestion avoidance techniques to try to avoid congestion collapse.

Router - westwood

Server 1 - cubic

Server 2 - cubic

net.ipv4.tcp_syn_retries - Specifies how many times to try to retransmit the initial SYN packet for an active TCP connection attempt. The current setting is 20, which means that there are 20 retransmission attempts before the connection times out. This can take several minutes, depending on the length of the retransmission attempt.

Router - 6

Server 1 - 6

Server 2 - 6

net.ipv4.tcp_low_latency - The default value is 0 (off). For workloads or environments where latency is a higher priority, the recommended value is 1 (on).

Router - 0

Server 1 - 0

Server 2 - 0

net.ipv4.tcp_limit_output_bytes - Using this parameter, TCP controls small queue limits on per TCP socket basis. TCP tends to increase the data in-flight until loss notifications are received. With aspects of TCP send auto-tuning, large amounts of data might get queued at the device on the local machine, which can adversely impact the latency for other streams. tcp_limit_output_bytes limits the number of bytes on a device to reduce the latency effects caused by a larger queue size.

Router - 262144

Server 1 - 1048576

Server 2 - 262144

Somewhere else used - 262,144 (IBM), 131,072 (IBM)
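For completeness, a hedged sketch of how candidate values like those above are tried and persisted; the numbers are illustrative, not recommendations:

# Try values at runtime (lost on reboot):
sudo sysctl -w net.core.rmem_max=16777216
sudo sysctl -w net.core.wmem_max=16777216
sudo sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"
sudo sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"

# Persist by putting the same settings (without "sysctl -w") into
# /etc/sysctl.d/99-tcp-tuning.conf and reloading:
sudo sysctl --system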

How to understand kernel crash with vmcore-dmesg.txt and kexec-dmesg.log

Posted: 13 Jun 2021 04:48 PM PDT

I have a server running CentOS 8. The kernel crashed one day, and I found the following three files in /var/crash: vmcore, vmcore-dmesg.txt, and kexec-dmesg.log.

I first looked at vmcore-dmesg.txt, which gives me the following info at the end

[291071.552140] {2}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
[291071.552141] {2}[Hardware Error]: event severity: fatal
[291071.552141] {2}[Hardware Error]:  Error 0, type: fatal
[291071.552142] {2}[Hardware Error]:   section_type: PCIe error
[291071.552142] {2}[Hardware Error]:   port_type: 4, root port
[291071.552142] {2}[Hardware Error]:   version: 3.0
[291071.552143] {2}[Hardware Error]:   command: 0x0547, status: 0x4010
[291071.552143] {2}[Hardware Error]:   device_id: 0000:16:01.0
[291071.552143] {2}[Hardware Error]:   slot: 82
[291071.552144] {2}[Hardware Error]:   secondary_bus: 0x18
[291071.552144] {2}[Hardware Error]:   vendor_id: 0x8086, device_id: 0x2031
[291071.552145] {2}[Hardware Error]:   class_code: 000406
[291071.552145] {2}[Hardware Error]:   bridge: secondary_status: 0x0000, control: 0x0013
[291071.552145] {2}[Hardware Error]:   aer_uncor_status: 0x00000020, aer_uncor_mask: 0x00100000
[291071.552146] {2}[Hardware Error]:   aer_uncor_severity: 0x00062030
[291071.552146] {2}[Hardware Error]:   TLP Header: 00000000 00000000 00000000 00000000
[291071.552146] Kernel panic - not syncing: Fatal hardware error!
[291071.552147] CPU: 0 PID: 0 Comm: swapper/0 Kdump: loaded Not tainted 4.18.0-305.3.1.el8.x86_64 #1
[291071.552147] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./EPC621D8A, BIOS P2.10 04/03/2019
[291071.552148] Call Trace:
[291071.552148]  <NMI>
[291071.552148]  dump_stack+0x5c/0x80
[291071.552149]  panic+0xe7/0x2a9
[291071.552149]  __ghes_panic.cold.32+0x21/0x21
[291071.552149]  ghes_notify_nmi+0x273/0x310
[291071.552149]  nmi_handle+0x63/0x110
[291071.552150]  default_do_nmi+0x49/0x100
[291071.552150]  do_nmi+0x17e/0x1e0
[291071.552150]  end_repeat_nmi+0x16/0x6f
[291071.552151] RIP: 0010:intel_idle+0x6b/0xb0
[291071.552151] Code: 40 5c 01 00 48 89 d1 0f 01 c8 48 8b 00 a8 08 75 19 e9 07 00 00 00 0f 00 2d 1e 01 55 00 c1 ee 18 b9 01 00 00 00 89 f0 0f 01 c9 <65> 48 8b 04 25 40 5c 01 00 f0 80 60 02 df f0 83 44 24 fc 00 48 8b
[291071.552152] RSP: 0018:ffffffff8fe03e40 EFLAGS: 00000002
[291071.552152] RAX: 0000000000000020 RBX: ffffffff8ff30ba8 RCX: 0000000000000001
[291071.552153] RDX: 0000000000000000 RSI: 0000000000000020 RDI: 0000000000000003
[291071.552153] RBP: ffff9e4a20835ad8 R08: 0000000000000002 R09: 0000000000029700
[291071.552154] R10: 0002cd7f37820a74 R11: ffff9e4a20828be4 R12: ffffffff8ff30a40
[291071.552154] R13: 0000000000000003 R14: 0000000000000003 R15: 0000000000000003
[291071.552154]  ? intel_idle+0x6b/0xb0
[291071.552154]  ? intel_idle+0x6b/0xb0
[291071.552155]  </NMI>
[291071.552155]  cpuidle_enter_state+0x87/0x3c0
[291071.552155]  cpuidle_enter+0x2c/0x40
[291071.552156]  do_idle+0x234/0x260
[291071.552156]  cpu_startup_entry+0x6f/0x80
[291071.552156]  start_kernel+0x518/0x538
[291071.552157]  secondary_startup_64_no_verify+0xc2/0xcb

Using lspci, I can find that 0000:16:01.0 is "16:01.0 PCI bridge: Intel Corporation Sky Lake-E PCI Express Root Port B (rev 02)", which seems to be the PCI-E root port. And:

lspci -s 16:01.0 -tvv
0000:16:01.0-[18-1b]----00.0-[19-1b]----03.0-[1a-1b]--+-00.0  Intel Corporation Ethernet Connection X722 for 1GbE
                                                      +-00.1  Intel Corporation Ethernet Connection X722 for 1GbE
                                                      +-00.2  Intel Corporation Ethernet Connection X722 for 1GbE
                                                      \-00.3  Intel Corporation Ethernet Connection X722 for 1GbE

Then I looked at the kexec-dmesg.log file, which says

[Thu Jun 10 20:02:45 2021] Memory manager not clean during takedown.
[Thu Jun 10 20:02:45 2021] WARNING: CPU: 0 PID: 399 at drivers/gpu/drm/drm_mm.c:999 drm_mm_takedown+0x1f/0x30 [drm]
[Thu Jun 10 20:02:45 2021] Modules linked in: amdgpu(+) sd_mod t10_pi sg iommu_v2 gpu_sched i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops crc32c_intel drm ahci libahci uas libata usb_storage dm_mirror dm_region_hash dm_log dm_mod fuse overlay squashfs loop
[Thu Jun 10 20:02:45 2021] CPU: 0 PID: 399 Comm: systemd-udevd Tainted: G        W        --------- -  - 4.18.0-305.3.1.el8.x86_64 #1
[Thu Jun 10 20:02:45 2021] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./EPC621D8A, BIOS P2.10 04/03/2019
[Thu Jun 10 20:02:45 2021] RIP: 0010:drm_mm_takedown+0x1f/0x30 [drm]
[Thu Jun 10 20:02:45 2021] Code: f6 c3 48 8d 41 c0 eb bb 0f 1f 00 0f 1f 44 00 00 48 8b 47 38 48 83 c7 38 48 39 c7 75 01 c3 48 c7 c7 58 57 1b c0 e8 da b6 f6 c0 <0f> 0b c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 0f 1f 44 00 00
[Thu Jun 10 20:02:45 2021] RSP: 0018:ffffc90000747a10 EFLAGS: 00010282
[Thu Jun 10 20:02:45 2021] RAX: 0000000000000000 RBX: ffff88805d44caf0 RCX: ffffffff8265f1c8
[Thu Jun 10 20:02:45 2021] RDX: 0000000000000001 RSI: 0000000000000096 RDI: 0000000000000246
[Thu Jun 10 20:02:45 2021] RBP: ffff888050e65030 R08: 00000000000005e6 R09: 0000000000aaaaaa
[Thu Jun 10 20:02:45 2021] R10: 0000000000000000 R11: ffffc900009e0320 R12: ffff88805d44ca00
[Thu Jun 10 20:02:45 2021] R13: ffff888050e64f68 R14: 0000000000000000 R15: 0000000000000000
[Thu Jun 10 20:02:45 2021] FS:  00007f16a3901180(0000) GS:ffff88805ea00000(0000) knlGS:0000000000000000
[Thu Jun 10 20:02:45 2021] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Thu Jun 10 20:02:45 2021] CR2: 0000564d0235b008 CR3: 000000005d5b6002 CR4: 00000000007706b0
[Thu Jun 10 20:02:45 2021] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[Thu Jun 10 20:02:45 2021] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[Thu Jun 10 20:02:45 2021] PKRU: 55555554
[Thu Jun 10 20:02:45 2021] Call Trace:
[Thu Jun 10 20:02:45 2021]  amdgpu_gtt_mgr_fini+0x2d/0x80 [amdgpu]
[Thu Jun 10 20:02:45 2021]  ttm_bo_clean_mm+0xa8/0xc0 [ttm]
[Thu Jun 10 20:02:45 2021]  amdgpu_ttm_fini+0x98/0xe0 [amdgpu]
[Thu Jun 10 20:02:45 2021]  amdgpu_bo_fini+0xe/0x30 [amdgpu]
[Thu Jun 10 20:02:45 2021]  gmc_v9_0_sw_fini+0x59/0xa0 [amdgpu]
[Thu Jun 10 20:02:45 2021]  amdgpu_device_fini+0x297/0x4af [amdgpu]
[Thu Jun 10 20:02:45 2021]  amdgpu_driver_unload_kms+0x3e/0x70 [amdgpu]
[Thu Jun 10 20:02:45 2021]  amdgpu_driver_load_kms+0x122/0x2a0 [amdgpu]
[Thu Jun 10 20:02:45 2021]  amdgpu_pci_probe+0xd1/0x150 [amdgpu]
[Thu Jun 10 20:02:45 2021]  local_pci_probe+0x41/0x90
[Thu Jun 10 20:02:45 2021]  pci_device_probe+0x105/0x1c0
[Thu Jun 10 20:02:45 2021]  really_probe+0x255/0x4a0
[Thu Jun 10 20:02:45 2021]  driver_probe_device+0x49/0xc0
[Thu Jun 10 20:02:45 2021]  device_driver_attach+0x50/0x60
[Thu Jun 10 20:02:45 2021]  __driver_attach+0x61/0x130
[Thu Jun 10 20:02:45 2021]  ? device_driver_attach+0x60/0x60
[Thu Jun 10 20:02:45 2021]  bus_for_each_dev+0x77/0xc0
[Thu Jun 10 20:02:45 2021]  ? klist_add_tail+0x3b/0x70
[Thu Jun 10 20:02:45 2021]  bus_add_driver+0x14d/0x1e0
[Thu Jun 10 20:02:45 2021]  ? 0xffffffffc07d3000
[Thu Jun 10 20:02:45 2021]  driver_register+0x6b/0xb0
[Thu Jun 10 20:02:45 2021]  ? 0xffffffffc07d3000
[Thu Jun 10 20:02:45 2021]  do_one_initcall+0x46/0x1c3
[Thu Jun 10 20:02:45 2021]  ? do_init_module+0x22/0x220
[Thu Jun 10 20:02:45 2021]  ? kmem_cache_alloc_trace+0x131/0x270
[Thu Jun 10 20:02:45 2021]  do_init_module+0x5a/0x220
[Thu Jun 10 20:02:45 2021]  load_module+0x14c5/0x17f0
[Thu Jun 10 20:02:45 2021]  ? __switch_to_asm+0x35/0x70
[Thu Jun 10 20:02:45 2021]  ? __switch_to_asm+0x41/0x70
[Thu Jun 10 20:02:45 2021]  ? __switch_to_asm+0x35/0x70
[Thu Jun 10 20:02:45 2021]  ? __switch_to_asm+0x41/0x70
[Thu Jun 10 20:02:45 2021]  ? apic_timer_interrupt+0xa/0x20
[Thu Jun 10 20:02:45 2021]  ? __do_sys_init_module+0x13b/0x180
[Thu Jun 10 20:02:45 2021]  __do_sys_init_module+0x13b/0x180
[Thu Jun 10 20:02:45 2021]  do_syscall_64+0x5b/0x1a0
[Thu Jun 10 20:02:45 2021]  entry_SYSCALL_64_after_hwframe+0x65/0xca
[Thu Jun 10 20:02:45 2021] RIP: 0033:0x7f16a24df80e
[Thu Jun 10 20:02:45 2021] Code: 48 8b 0d 7d 16 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 4a 16 2c 00 f7 d8 64 89 01 48
[Thu Jun 10 20:02:45 2021] RSP: 002b:00007ffc5a383dd8 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
[Thu Jun 10 20:02:45 2021] RAX: ffffffffffffffda RBX: 0000558aa33c7ee0 RCX: 00007f16a24df80e
[Thu Jun 10 20:02:45 2021] RDX: 0000558aa33c85e0 RSI: 00000000009621ec RDI: 0000558aa3def1a0
[Thu Jun 10 20:02:45 2021] RBP: 0000558aa33c85e0 R08: 0000558aa33c301a R09: 0000000000000003
[Thu Jun 10 20:02:45 2021] R10: 0000558aa33c3010 R11: 0000000000000246 R12: 0000558aa3def1a0
[Thu Jun 10 20:02:45 2021] R13: 0000558aa33dabf0 R14: 0000000000020000 R15: 0000000000000000
[Thu Jun 10 20:02:45 2021] ---[ end trace 0950097d77ca3e03 ]---

This seems to me to be related to the GPU driver.

To my understanding, when the kernel crashes, kdump boots another kernel using kexec to dump the crashed kernel. So the logs look to me as if a PCI-E hardware error made the main kernel crash, and when the kdump kernel started, it crashed again because of a GPU driver error. Am I understanding this correctly? Or is the trace shown in kexec-dmesg.log actually the stack trace of the main kernel?

My second question is then how to understand these error messages. As it seems only the NIC is connected to this PCI-E root port, is there something wrong with my motherboard/CPU, or is the problem more likely in the kernel?
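As a hedged pointer for digging further: the vmcore itself can be opened with the crash utility, given a matching kernel-debuginfo package. The commands below are typical for CentOS 8, and the dump directory name is a placeholder:

sudo dnf install crash
sudo dnf debuginfo-install kernel-4.18.0-305.3.1.el8.x86_64
crash /usr/lib/debug/lib/modules/4.18.0-305.3.1.el8.x86_64/vmlinux \
      /var/crash/<dump-dir>/vmcore
# inside crash: 'log' prints the ring buffer, 'bt' the panic backtrace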

As a side note, I found in /var/log that the following error happens often and does not crash the kernel:

Jun  7 11:12:20 localhost kernel: {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0
Jun  7 11:12:20 localhost kernel: {1}[Hardware Error]: It has been corrected by h/w and requires no further action
Jun  7 11:12:20 localhost kernel: {1}[Hardware Error]: event severity: corrected
Jun  7 11:12:20 localhost kernel: {1}[Hardware Error]:  Error 0, type: corrected
Jun  7 11:12:20 localhost kernel: {1}[Hardware Error]:   section_type: PCIe error
Jun  7 11:12:20 localhost kernel: {1}[Hardware Error]:   port_type: 5, upstream switch port
Jun  7 11:12:20 localhost kernel: {1}[Hardware Error]:   version: 3.0
Jun  7 11:12:20 localhost kernel: {1}[Hardware Error]:   command: 0x0147, status: 0x0010
Jun  7 11:12:20 localhost kernel: {1}[Hardware Error]:   device_id: 0000:18:00.0
Jun  7 11:12:20 localhost kernel: {1}[Hardware Error]:   slot: 82
Jun  7 11:12:20 localhost kernel: {1}[Hardware Error]:   secondary_bus: 0x19
Jun  7 11:12:20 localhost kernel: {1}[Hardware Error]:   vendor_id: 0x8086, device_id: 0x37c0
Jun  7 11:12:20 localhost kernel: {1}[Hardware Error]:   class_code: 000406
Jun  7 11:12:20 localhost kernel: {1}[Hardware Error]:   bridge: secondary_status: 0x2000, control: 0x0013
Jun  7 11:12:20 localhost kernel: pcieport 0000:18:00.0: aer_status: 0x00003000, aer_mask: 0x00002000
Jun  7 11:12:20 localhost kernel: pcieport 0000:18:00.0:    [12] Timeout
Jun  7 11:12:20 localhost kernel: pcieport 0000:18:00.0: aer_layer=Data Link Layer, aer_agent=Transmitter ID

where 18:00.0 is a PCI bridge ("18:00.0 PCI bridge: Intel Corporation Device 37c0 (rev 09)") and:

lspci -s 18:00.0 -tvv
0000:18:00.0-[19-1b]----03.0-[1a-1b]--+-00.0  Intel Corporation Ethernet Connection X722 for 1GbE
                                      +-00.1  Intel Corporation Ethernet Connection X722 for 1GbE
                                      +-00.2  Intel Corporation Ethernet Connection X722 for 1GbE
                                      \-00.3  Intel Corporation Ethernet Connection X722 for 1GbE

Any help will be greatly appreciated.

2 domains configured on digital ocean, only 1 works with direct browser access using Apache2

Posted: 13 Jun 2021 03:57 PM PDT

I have a droplet on DigitalOcean which was initially configured with only one domain (andrey.dev.br); it worked right out of the box after installing either Apache or Nginx, without any extra configuration.

After some time, I configured a second domain (raphaelvieira.dev) on the same droplet via the DigitalOcean admin panel. The first weird behavior was that every time I tried to access the domain raphaelvieira.dev in a browser (Chrome, Firefox, etc.), the browser automatically redirected it to https://raphaelvieira.dev, which is odd because I don't have HTTPS configured on Apache. It works, though, if I access it via the terminal with curl, for example:

curl http://raphaelvieira.dev

After some unsuccessful tries, I decided to add virtual hosts for the two domains, following this tutorial. The first one (andrey.dev.br) continued to work fine via browser access or curl, but raphaelvieira.dev, when accessed from a browser, started to return "www.raphaelvieira.dev took too long to respond.", while still working via curl in the terminal.

The raphaelvieira.dev domain is registered on Google Domains.

Why is this happening?

Restore files from software RAID 1

Posted: 13 Jun 2021 03:51 PM PDT

I accidentally erased the files on my mdadm RAID: I created a Docker container and mapped my RAID into it, and after that all the files on the RAID disappeared. The disks are not currently being written to or read. What is the way to recover my files? Unfortunately, I don't have a backup. I am open to any suggestions, even attaching the disks to a Windows machine. This is the Docker Compose config file I used: https://pastebin.com/PqwEkZ4G

Thanks in advance.
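Not an answer, but the generally recommended first step in this situation, sketched with placeholder paths: stop all writes, image the array (or each member disk) to separate storage, and run recovery tools against the image rather than the live disks:

# Image the assembled array read-only to separate storage:
sudo dd if=/dev/md0 of=/mnt/backup/md0.img bs=4M status=progress

# Then run file-carving tools such as testdisk/photorec on the image:
sudo photorec /mnt/backup/md0.img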

Kerberos: ticket with no REALM after principal name (i.e. `principal@`)

Posted: 13 Jun 2021 02:57 PM PDT

When I run klist after SSHing into a Kerberized instance, I obtain the TGS for the principal host/vmtest001. However, why do I get two of them, including one with no REALM after the @ separator?

Here is the output of klist:

Ticket cache: FILE:/tmp/krb5cc_1000
Default principal: athena@EXAMPLE.COM

Valid starting     Expires            Service principal
06/13/21 21:05:00  06/14/21 07:05:00  krbtgt/EXAMPLE.COM@EXAMPLE.COM
        renew until 06/14/21 21:04:59
06/13/21 21:05:03  06/14/21 07:05:00  host/vmtest001@
        renew until 06/14/21 21:04:59
06/13/21 21:05:03  06/14/21 07:05:00  host/vmtest001@EXAMPLE.COM
        renew until 06/14/21 21:04:59

Linux - Set-up a client-server network from scratch

Posted: 13 Jun 2021 02:50 PM PDT

I have been looking for this but haven't been able to find anything "complete" out there, probably because it's such a standard setup that people think it's included in the human genome by now. I know there are a lot of things in one question, but they are all related.

Here's what I would like to do:

  • Have one (in a more involved scenario more than one) Linux server, where information about users is stored.
  • Have several clients (700-800, a mix of Linux and Windows) from which users can work.
  • Users have different roles (e.g. students, teachers and staff). Depending on the role, they have different permissions and different programs they need to use (students don't need to use the accounting software).
  • Users can login on any computer in the network and always see their desktop the way they left it when they last logged out. All the programs they need will be there, with the possibility for the administrator to add, update or remove programs for any users group.
  • All users of a certain category (say teachers) should each have his/her own folder and a shared folder where they put things they are working on together.
  • Ideally, users (at least some of them) should be able to login from home over the internet and still see their desktop with their programs, just as if they were in the office.

Bonus features:

  • When a new user needs to be created (new students come at the beginning of the new year) their information (Name, Address, Birth date, etc.) can be imported from a csv file and the system will automatically give them an initial random password (which they will have to change on first access) and create a mail account for them (like firstname.lastname@beautifulschool.edu)
  • When the organization decides to change the 7-800 computers or to buy 2000 new ones, it should be possible to configure one of them and then "clone" the configuration on all others (if not at one time, at least in batches of some tens of them)

So the question is not necessarily how to do this. It would suffice to point to detailed information online. What I could find is not complete and very scattered around the net, so I can't put it back together to save my life.

how to simply monitor (or tail) several log files from several remote machines

Posted: 13 Jun 2021 10:34 PM PDT


A background:

  • I have a product that's installed on labs composed of several machines (some labs have 3 VMs, some 8) running Windows Server 2016 and up, and Windows 10.
  • The product creates several log files (*.log) with different name and purpose in each machine.
  • I think that those log files are created by the log4net feature...
  • There are 5 services to follow up: IIS, SQL, RabbitMQ, Product's service
  • I cannot use UNC paths (like \\server-name\logs\product.log) to those machines.
  • Machines have no access to internet
  • Currently, if I want to monitor a log file, I need to RDP to each machine and run the following PowerShell line: Get-Content C:\Product\Logs\Product.log -Wait -Tail 1000

The question:

  1. Is there a free, simple tool or script that can connect to those secured labs (SSL, TLS) and show those log files in real time on my host? (I already reviewed BareTail and WinTail; I didn't understand whether Prometheus, Grafana, and other tools could suit my needs...)
  2. Is there a free, simple tool or script that can connect to those secured labs (SSL, TLS) and show those services' event logs in real time on my host?
  3. How can I use that PowerShell line to monitor the log files from all the lab's machines? (I already tried Invoke-Command to several machines, but my terminal became a jungle... see the sketch below.)
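A hedged sketch of one way to merge tails from several machines with PowerShell remoting; the machine names are placeholders, and each line is prefixed with the computer it came from:

$machines = 'lab-vm01', 'lab-vm02', 'lab-vm03'   # placeholder names
Invoke-Command -ComputerName $machines -ScriptBlock {
    Get-Content 'C:\Product\Logs\Product.log' -Wait -Tail 50
} | ForEach-Object {
    # PSComputerName is attached automatically to remoting output
    '[{0}] {1}' -f $_.PSComputerName, $_
}

This streams until interrupted with Ctrl+C; for the service event logs, a similar pattern can poll Get-WinEvent inside the script block.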

Best practice for adding contract companies as AAD guests

Posted: 13 Jun 2021 05:47 PM PDT

Our small business has an Azure Active Directory (AAD) that is the central repository for employees and "guests" who have access to various digital resources. Recently, we've started hiring outside companies to do product development. For now, this involves adding a very small number of people from each company to our AAD to allow them access to our resources.

When we start doing business with a company and their employees, I add their employees to our AAD and manually edit their profiles to specify their company name. That way, when we stop doing business with that company, I can filter my member list by their company name and remove them.

My question is: How do people who know what they are doing deal with this sort of thing (the ability to remove all members of a group from AAD when you stop doing business with that group)?
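For illustration, a hedged sketch of the query side of that approach with the Microsoft Graph PowerShell SDK; 'Contoso' is a placeholder, and filtering on companyName requires the advanced-query parameters:

Connect-MgGraph -Scopes 'User.Read.All'

# List guest users tagged with a given company name:
Get-MgUser -All -Filter "companyName eq 'Contoso' and userType eq 'Guest'" `
    -ConsistencyLevel eventual -CountVariable n |
    Select-Object DisplayName, Mail, CompanyName

A common alternative to free-text company attributes is one security group per vendor company, so offboarding becomes enumerating and removing a single group's members.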

deprecated 'sshKeys' metadata item | centos | latest guest environment already installed

Posted: 13 Jun 2021 07:36 PM PDT

When logging into a GCP VM from the browser, I get a popup saying:

The VM guest environment is outdated and only supports the deprecated 'sshKeys' metadata item. Please follow the steps here to update.

A: I followed the instructions and updated the guest environment as mentioned here, and ran the following commands:

sudo yum makecache
sudo yum install google-compute-engine google-compute-engine-oslogin \
google-guest-agent google-osconfig-agent

B: As a result, I now have the following packages

google-compute-engine-20210204.00-g1.el7.noarch
google-compute-engine-oslogin-20210429.00-g1.el7.x86_64
google-guest-agent-20210223.01-g1.el7.x86_64
google-osconfig-agent-20210429.3-g1.el7.x86_64

C: I restarted the VM and am still getting the same "The VM guest environment is outdated" message.

What can be the issue?
Note: I am also unable to use the SSH-key-via-metadata feature, as I am trying an SSH key via metadata for the first time on this VM. Also, this VM was created from an image that was more than two years old.


PS: I have validated the environment as mentioned here.

>>> sudo systemctl list-unit-files | grep google | grep enabled
google-accounts-manager.service               enabled
google-address-manager.service                enabled
google-clock-sync-manager.service             enabled
google-guest-agent.service                    enabled
google-osconfig-agent.service                 enabled
google-shutdown-scripts.service               enabled
google-startup-scripts.service                enabled
google-oslogin-cache.timer                    enabled

The serial port console also looks OK.

Installed packages are

rpm -qa --queryformat '%{NAME}\n' \
> | grep -iE google\|gce | grep -iE \
> 'google|gce'
google-compute-daemon
google-compute-engine
google-cloud-sdk
google-compute-engine-oslogin
google-guest-agent
google-osconfig-agent

I do see in the logs that the Google agent creates /home/user-configured-in-ssh-meta, but it does not add the key to the authorized_keys file.
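As a hedged diagnostic sketch, the keys the agent is supposed to process can be read from the metadata server on the VM itself; this endpoint is standard on GCE:

curl -H "Metadata-Flavor: Google" \
  "http://metadata.google.internal/computeMetadata/v1/instance/attributes/ssh-keys"

If the key shows up there but never lands in ~/.ssh/authorized_keys, the guest agent's own logs (journalctl -u google-guest-agent) are the next place to look.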

Carbon neutral data center?

Posted: 13 Jun 2021 07:33 PM PDT

I stumbled upon https://www.climateneutraldatacentre.net/ after thinking about my company's carbon footprint.

I know this is a very complex topic, and not just as trivial as where the energy for the data centres comes from, but I cannot find any data centres that claim any carbon offsetting or neutrality.

Has anyone seen an environmentally friendly data-center?

Mail queue grows huge even though mail is sent (but slowly), and sending without SpamAssassin does not work

Posted: 13 Jun 2021 10:10 PM PDT

I have set up an Ubuntu 20.04 server to only use SMTP and send email from it, but I am facing an issue: when SpamAssassin is stopped, Postfix does not try to send anything, while when SpamAssassin is enabled I can see the mail queue decreasing and the Postfix logs show the sending.

Also, when it is sending, the send rate is slow, about 1 to 10 messages per minute, although I did not set any limits; I checked /etc/postfix/main.cf and there was nothing that would impose a limit.

I see no errors in /var/log/mail.err and nothing happens in /var/log/mail.log when spamassassin is not enabled.

Would you please help me?
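For context, a common way SpamAssassin is wired into Postfix is as a content_filter in /etc/postfix/master.cf. A hedged sketch of that typical pattern (the filter user and spamc arguments vary by setup):

smtp      inet  n       -       y       -       -       smtpd
    -o content_filter=spamassassin

spamassassin unix -     n       n       -       -       pipe
    user=debian-spamd argv=/usr/bin/spamc -f -e
    /usr/sbin/sendmail -oi -f ${sender} ${recipient}

If the setup looks like this, every queued message is handed to spamc before re-injection, which would explain both symptoms: with the filter service down, delivery stalls, and the filter's throughput caps the send rate.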

Configuring a PHP Web Service Container at Build Time

Posted: 13 Jun 2021 10:00 PM PDT

I am building a PHP Web Service Container Prototype.

  1. The first and simplest way was to create a container with the official image published by WordPress, with a simple Dockerfile like:

FROM wordpress:latest

which builds, but then fails to run:

# docker run -it wordpress_local apache2-foreground
WordPress not found in /var/www/html - copying now...
Complete! WordPress has been successfully copied to /var/www/html
AH00534: apache2: Configuration error: No MPM loaded.

which is a known error that I was unable to fix, so the image is broken.
Other images like php7-apache2 also produce the same error.

  2. Unable to find a prebuilt image that would actually run, I started to build an image from scratch. It contains:
  • Alpine Linux 3.12
  • Apache 2.4
  • PHP 7.3

with the Dockerfile:

# cat Dockerfile
FROM alpine:3.12
RUN apk add apache2 php7 php7-apache2
ADD html/ /var/www/html/
WORKDIR /var/www/html/
CMD ["httpd", "-DNO_DETACH -DFOREGROUND -e debug"]

and a docker-compose.yml:

# cat docker-compose.yml
version: '3'
services:
  web:
    image: php_web_alpine
    build: .
    ports:
     - "8081:8081"

This builds nicely:

# docker build -t php_web_alpine .
Sending build context to Docker daemon  7.68 kB
Step 1/5 : FROM alpine:3.12
 ---> a24bb4013296
Step 2/5 : RUN apk add apache2 php7 php7-apache2
 ---> Using cache
 ---> bf59e0c43f1f
Step 3/5 : ADD html/ /var/www/html/
 ---> 0fe4bfd871b2
Removing intermediate container cec9de242174
Step 4/5 : WORKDIR /var/www/html/
 ---> 03d3fe0a077f
Removing intermediate container b1763eb3e56b
Step 5/5 : CMD httpd -DNO_DETACH -DFOREGROUND -e debug
 ---> Running in 4ca69abc9f52
 ---> e3a33ae6e028
Removing intermediate container 4ca69abc9f52
Successfully built e3a33ae6e028

But then it does not run, because of a simple configuration error:

# docker-compose up
Recreating phpalpine_web_1 ... done
Attaching to phpalpine_web_1
web_1  | AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 172.20.0.2. Set the 'ServerName' directive globally to suppress this message
phpalpine_web_1 exited with code 0

It just needs the httpd.conf configuration file filled in with the correct values.

So, how can I fill in the web service configuration files (Apache, PHP, and others) at build time to get a nice, reproducible build?
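One hedged sketch of baking such settings in at build time: append or copy the values in the Dockerfile itself, so every build produces the same image. The config path matches the Alpine apache2 package, and the ServerName value is a placeholder:

FROM alpine:3.12
RUN apk add --no-cache apache2 php7 php7-apache2 \
 && echo "ServerName localhost" >> /etc/apache2/httpd.conf
COPY html/ /var/www/html/
WORKDIR /var/www/html/
# exec-form CMD with each flag as its own argument
CMD ["httpd", "-D", "FOREGROUND"]

For larger edits, COPY a pre-written httpd.conf (kept in the build context) over the packaged one; the result is equally reproducible and easier to review.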

AWS Unified CloudWatch Agent and Beanstalk

Posted: 13 Jun 2021 03:02 PM PDT

AWS offers a newly developed log collector and CloudWatch uploader, as described here: https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/UseCloudWatchUnifiedAgent.html

My question is: how do I make use of this new agent with Beanstalk? Currently we are using the solution stack 64bit Amazon Linux 2018.03 v2.7.1 running Java 8. As this is Elastic Beanstalk, I don't want to migrate our EC2 machines by hand, because they are regularly thrown away and recreated by Beanstalk (rendering manual migration pointless), so my understanding is that I need a new solution/AMI version that has this unified agent instead of the old one. Our motivation is that the old version is buggy: it sometimes misses uploading random chunks of rotated logs, leading to an hour of completely missing logs. This bug has been verified by AWS, and instead of fixing it, they wrote a completely new application and left the old one without fixes.

This article refers to an "old" agent (awslogs) that is built into the mentioned AMI. I tried to read the AWS documentation, but I did not find any official AMI versions offering the new agent.

I am really hoping that an official AMI offers this uploader; if so, which AMI is it? Thanks!

PS: I would not create a custom AMI for this purpose, as we want to stick to official Amazon Linux AMIs.
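Short of an AMI that ships it, one hedged sketch is installing the unified agent per environment via .ebextensions, so rebuilt instances get it automatically. The file name and config path are illustrative, and this assumes the amazon-cloudwatch-agent package is available in the platform's yum repositories:

# .ebextensions/cloudwatch-agent.config
packages:
  yum:
    amazon-cloudwatch-agent: []
commands:
  01_start_cloudwatch_agent:
    # assumes /etc/cloudwatch-agent.json was written earlier,
    # e.g. via a files: section in this same config
    command: |
      /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
        -a fetch-config -m ec2 -c file:/etc/cloudwatch-agent.json -s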

502 Bad Gateway nginx/1.10.3 error with NodeJS (Debian 9)

Posted: 13 Jun 2021 03:02 PM PDT

For a few days I have had a server where I successfully run a few web applications developed in NodeJS.

Everything worked fine until suddenly the browser started to show the error 502 Bad Gateway nginx/1.10.3 when trying to access the website. I have not made any changes that could create this type of error.

Looking for information on the web, it seems that this error is related to the way Nginx directs the request to my application's port. I have reviewed the configuration in my /etc/nginx/sites-available/default and everything seems correct. This is an excerpt from my configuration:

# --------------------------
# WEBSITE 1 - www.mywebsite.com
# --------------------------

server {
    listen 80;

    if ($host = mywebsite.com) {
        return 301 https://www.mywebsite.com$request_uri;
    }
    if ($host = www.mywebsite.com) {
        return 301 https://www.mywebsite.com$request_uri;
    }

    server_name www.mywebsite.com mywebsite.com;
    location / {
        proxy_pass "http://127.0.0.1:3000";
    }
}

server {
    listen 443 ssl;

    if ($host = mywebsite.com) {
        return 301 https://www.mywebsite.com$request_uri;
    }

    server_name www.mywebsite.com mywebsite.com;
    location / {
        proxy_pass "http://127.0.0.1:3000";
    }

    # LetsEncrypt Certificates
    ssl_certificate /etc/letsencrypt/live/mywebsite.com/fullchain.pem; # managed by Certbot
    ssl_certificate_key /etc/letsencrypt/live/mywebsite.com/privkey.pem; # managed by Certbot
    include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
    ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot
}

# --------------------------
# WEBSITE 2 - www.mywebsite2.com
# --------------------------

server {
    listen 80;

    if ($host = mywebsite2.com) {
        return 301 https://www.mywebsite2.com$request_uri;
    }
    if ($host = www.mywebsite2.com) {
        return 301 https://www.mywebsite2.com$request_uri;
    }

    server_name www.mywebsite2.com mywebsite2.com;
    location / {
        proxy_pass "http://127.0.0.1:3001";
    }
}

server {
    listen 443 ssl;

    if ($host = mywebsite2.com) {
        return 301 https://www.mywebsite2.com$request_uri;
    }

    server_name www.mywebsite2.com mywebsite2.com;
    location / {
        proxy_pass "http://127.0.0.1:3001";
    }

    # [...] More LetsEncrypt certificates, and more websites [...]

}

Also, I looked at the nginx error.log file and I can see that this line is written every time the website is accessed since this error happened:

[error] connect() failed (111: Connection refused) while connecting to upstream, client: XXX.XXX.XXX.XXX, server: www.mywebsite.com, request: "GET /aCustomUrl HTTP/1.1", upstream: "http://127.0.0.1:3000/aCustomUrl", host: "www.mywebsite.com"  

Do you have any hint about what could be happening? The config seems OK to me, and it worked successfully for several days. I tried restarting the server, but it does not help...

Thanks all.
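Since connect() failed (111: Connection refused) on 127.0.0.1:3000 usually means nothing is listening on that port, a hedged first-pass diagnostic (the pm2 commands are an assumption about how the apps are managed):

# Is anything listening on the ports nginx proxies to?
sudo ss -tlnp | grep -E ':3000|:3001'

# If the Node apps run under a process manager such as pm2:
pm2 list
pm2 logs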

How to manually setup network connection from Busybox shell (ash)?

Posted: 13 Jun 2021 05:04 PM PDT

An embedded device running Linux 2.6.26.5 (ARM kernel) with a BusyBox v1.10.2 shell (ash); I'm in the BusyBox shell. I want to set up a connection between the embedded device and a computer. Is it possible to manually set up a network connection from the BusyBox shell? I mounted the main virtual file systems (proc, sysfs, tmpfs, /dev/pts), then entered commands to set up the network, but without success. I guess some modules or drivers were possibly not loaded in this shell mode, but I'm not sure.

BusyBox v1.10.2 (2017-08-02 14:07:25 CST) built-in shell (ash)
Enter 'help' for a list of built-in commands.

/bin/sh: can't access tty; job control turned off
# mount -t proc proc /proc
# mount -t sysfs sysfs /sys
# mount -t tmpfs tmpfs /tmp
# mount -t tmpfs tmpfs /dev
# mkdir /dev/pts
# mount -t devpts devpts /dev/pts
# mdev -s
# ifconfig lo 127.0.0.1
# ifconfig eth0 hw ether 88:75:56:05:6D:28
# ifconfig eth0 192.168.15.1 netmask 255.255.255.0 broadcast 192.168.15.255
# ifconfig eth0 up
# route add -net 192.168.15.0/24 eth0
# ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 88:75:56:05:6D:28
          inet addr:192.168.15.1  Bcast:192.168.15.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
          Interrupt:22

# ping 192.168.15.100
PING 192.168.15.100 (192.168.15.100): 56 data bytes
From 192.168.15.100 icmp_seq=0 timed out

Edit: ifconfig eth0 output on Ubuntu computer:

$ ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 20:47:47:49:bc:75
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
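For comparison, the Ubuntu output above shows eth0 with no inet address and no RUNNING flag, so a hedged sketch of bringing the computer's side up on the device's subnet (the address is an assumption):

sudo ifconfig eth0 192.168.15.100 netmask 255.255.255.0 up
# or, with iproute2:
sudo ip addr add 192.168.15.100/24 dev eth0
sudo ip link set eth0 up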

IPSec strongswan "established successfully", but no ppp0

Posted: 13 Jun 2021 06:06 PM PDT

I'm trying to connect an Ubuntu Server 16.04 to an IPSec L2TP VPN using the strongswan client.

Apparently the connection is established successfully, but the ppp0 interface isn't created.

This is the result of sudo ipsec up myconnection:

initiating Main Mode IKE_SA myconnection[2] to 116.38.129.101
generating ID_PROT request 0 [ SA V V V V ]
sending packet: from 192.168.0.104[500] to 116.38.129.101[500] (212 bytes)
received packet: from 116.38.129.101[500] to 192.168.0.104[500] (132 bytes)
parsed ID_PROT response 0 [ SA V V V ]
received NAT-T (RFC 3947) vendor ID
received XAuth vendor ID
received DPD vendor ID
generating ID_PROT request 0 [ KE No NAT-D NAT-D ]
sending packet: from 192.168.0.104[500] to 116.38.129.101[500] (244 bytes)
received packet: from 116.38.129.101[500] to 192.168.0.104[500] (236 bytes)
parsed ID_PROT response 0 [ KE No NAT-D NAT-D ]
local host is behind NAT, sending keep alives
generating ID_PROT request 0 [ ID HASH N(INITIAL_CONTACT) ]
sending packet: from 192.168.0.104[4500] to 116.38.129.101[4500] (100 bytes)
received packet: from 116.38.129.101[4500] to 192.168.0.104[4500] (68 bytes)
parsed ID_PROT response 0 [ ID HASH ]
IKE_SA myconnection[2] established between 192.168.0.104[192.168.0.104]...116.38.129.101[116.38.129.101]
scheduling reauthentication in 10033s
maximum IKE_SA lifetime 10573s
generating QUICK_MODE request 1590491286 [ HASH SA No ID ID NAT-OA NAT-OA ]
sending packet: from 192.168.0.104[4500] to 116.38.129.101[4500] (220 bytes)
received packet: from 116.38.129.101[4500] to 192.168.0.104[4500] (188 bytes)
parsed QUICK_MODE response 1590491286 [ HASH SA No ID ID NAT-OA NAT-OA ]
connection 'myconnection' established successfully

Any hint?
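Worth noting as a hedged sketch: strongswan only negotiates the IPsec layer shown above; the ppp0 interface is created by the L2TP client (for example xl2tpd) running on top of it. A typical xl2tpd client section in /etc/xl2tpd/xl2tpd.conf looks roughly like this (the section name and options file are placeholders):

[lac myvpn]
lns = 116.38.129.101
ppp debug = yes
pppoptfile = /etc/ppp/options.l2tpd.client
length bit = yes

and the tunnel is then started through the control pipe:

echo "c myvpn" > /var/run/xl2tpd/l2tp-control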

How to verify signature on a file using OpenSSL with custom engine

Posted: 13 Jun 2021 07:02 PM PDT

Update Dec 28, 2017 – 3:

The author of the OpenSSL DSTU module kindly provided a patch to the OpenSSL+DSTU implementation with a fix for the issue, and assisted further.

I was able to accomplish what I need first with this command:

./apps/openssl smime -verify -noverify -in my_message.txt.p7s -engine dstu -inform DER
engine "dstu" set.
Hello, world!
Verification successful

And later, after concatenating a chain of certificates into bundle.pem, I was able to do this:

./apps/openssl smime -verify -CAfile bundle.pem -in /yo/my_message.txt.p7s -engine dstu -inform DER
engine "dstu" set.
Hello, world!
Verification successful

Update Dec 28, 2017 – 2:

The author of OpenSSL DSTU module confirmed that the module is not working properly at the moment – https://github.com/dstucrypt/openssl-dstu/issues/2#issuecomment-354288000.

I guess I'll have to look elsewhere to find a proper DSTU4145 implementation. I've just learned about the BouncyCastle project, and its specification includes DSTU-4145. I guess there are no options left but to write some Java code to perform the signature verification.

Update Dec 28, 2017 – 1:

Here are my files:


I have a file, signed by someone with his private key: signed_content.txt. I also have a certificate from CA. The private key and certificate are somehow related to each other.

How do I verify the signature on a file?

This is what I'm doing:

  1. Extract the public key from the certificate (obtained from the authority):

    openssl x509 -pubkey -inform der -in PrivateCerts/CA-3004751DEF2C78AE010000000100000049000000.cer -noout -engine dstu > public_key.txt  
  2. Attempt to verify the contents of the file:

openssl rsautl -verify -in my_message.txt.p7s -inkey public_key.txt -pubin -engine dstu
engine "dstu" set.
openssl (lock_dbg_cb): already locked (mode=9, type=18) at md_rand.c:387
openssl (lock_dbg_cb): not locked (mode=10, type=18) at dstu_rbg.c:87
Error getting RSA key
139964169291424:error:0607907F:digital envelope routines:EVP_PKEY_get1_RSA:expecting an rsa key:p_lib.c:288:

Also, how do I extract the actual contents of the signed file?


Is the file I have incorrect somehow? I can view its ASN.1 contents:

openssl asn1parse -inform DER -in my_message.txt.p7s -i  

The ASN.1 structure seems to look OK (honestly, I know too little about ASN.1): I can see some fields about the organization and such.

I'm using a DSTU engine (Ukrainian crypto standard), similar to GOST (Russian crypto standard).

KVM - Access from an external computer to the Virtual Machine

Posted: 13 Jun 2021 10:00 PM PDT

I have the following setup:

Notebook (IP: 192.168.1.100)
Host (IP: 192.168.1.129)

Both the Notebook and the Host are connected to a router (IP: 192.168.1.1).

The host has two virtual machines on it (Development, Office). Since the host uses a DHCP server (KVM), it assigns the following IP addresses to the VMs:

Development: 192.168.122.45
Office:      192.168.122.46

The DHCP server on the host has the IP address 192.168.122.1.

Now I would like to access the Development VM from my Notebook (192.168.1.100) on port 5900, to work on this VM remotely.

I used some iptables rules on the host, where the VMs are located, to achieve this:

iptables -t nat -I PREROUTING -p tcp -d 192.168.1.129 --dport 5900 -j DNAT --to-destination 192.168.122.45
iptables -I FORWARD -m state -d 192.168.122.0/24 --state NEW,RELATED,ESTABLISHED -j ACCEPT
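One thing I still have to double-check is that the host actually forwards packets (a quick sysctl check):

sysctl net.ipv4.ip_forward        # should print net.ipv4.ip_forward = 1
sysctl -w net.ipv4.ip_forward=1   # enable it on the running system if it is 0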

Unfortunately, I can't get a Spice connection to the Development VM:

 spice://192.168.1.129:5900  

I edited my VM with

virsh edit VM-Development  

and configured like this:

<graphics type='spice' port='5900' autoport='no' listen='127.0.0.1' keymap='de-ch'>
    <listen type='address' address='127.0.0.1'/>

After I added the iptables rules, the XML configuration file contains a new entry:

<video>
    <model type='qxl' ram='65536' vram='65536' vgamem='16384' heads='1'/>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
</video>

What's wrong? I followed several hints, but I can't get it to work. I also verified that port 5900 is open on the router.
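One idea I have not yet tried (a sketch, not a verified fix): with listen='127.0.0.1', SPICE only accepts connections on the host's loopback, so the forwarded traffic may never reach it. Binding it to all interfaces should let the Notebook connect to the host directly:

<graphics type='spice' port='5900' autoport='no' listen='0.0.0.0' keymap='de-ch'>
    <listen type='address' address='0.0.0.0'/>
</graphics>

With that in place, spice://192.168.1.129:5900 would target the host itself, and the DNAT rule to 192.168.122.45 should not be needed at all, since QEMU holds the listening socket on the host rather than inside the VM.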

nginx proxy_pass is being ignored

Posted: 13 Jun 2021 08:08 PM PDT

I have an Nginx server which works as a proxy server.

I also have 3 different NodeJS Express servers running on ports 8080, 9090 and 8888 (all working correctly).

Servers 8080 and 9090 run the same app. The server on 8888 should currently just return 'POSTING' when a POST request is forwarded to it.

Both GET and POST routes are set up correctly on server 8888, and I get a correct response when I call them directly with curl requests:

import Express from 'express';
import BodyParser from 'body-parser';

// Initialise server
const server = Express();

// for parsing application/json
server.use(BodyParser.json());

// Setting port
server.set('port', 8888);

server.post('/', function(request, response) {
    console.log('POSTING');
    response.send('POSTING');
});

server.get('/', function(request, response) {
    console.log('GETTING');
    response.send('GETTING');
});

console.log(`Starting server on port ${server.get('port')}`)
server.listen(server.get('port'));
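For reference, the direct calls that succeed look roughly like this (the JSON body is just an example):

curl -X POST http://127.0.0.1:8888/ -H 'Content-Type: application/json' -d '{"test": 1}'
# -> POSTING
curl http://127.0.0.1:8888/
# -> GETTING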

My Nginx config is the following:

upstream amit4got {
    server 127.0.0.1:8080;
    server 127.0.0.1:9090;
}

server {
    listen       7070;
    server_name  127.0.0.1;

    access_log  /usr/local/etc/nginx/logs/default.access.log;
    error_log   /usr/local/etc/nginx/logs/default.error.log;

    location /amit4got/ {
        proxy_set_header        Host $host;
        proxy_set_header        X-Real-IP $remote_addr;
        proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header        X-Forwarded-Proto $scheme;

        proxy_pass              http://amit4got;
        proxy_read_timeout      90;
    }

    location /amit4got/flows {
        proxy_set_header        Host $host;
        proxy_set_header        X-Real-IP $remote_addr;
        proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header        X-Forwarded-Proto $scheme;

        proxy_read_timeout      90;

        if ($request_method = POST) {
            proxy_pass          http://127.0.0.1:8888;
        }

        if ($request_method = GET) {
            proxy_pass          http://amit4got;
        }
    }
}

Nginx serves on port 7070, and for the endpoint /amit4got loads the content from either 8080 or 9090.

The problem is that when I try to POST to /amit4got/flows, the request never reaches http://127.0.0.1:8888; I just get a 404 Not Found response.

If I change the proxy_pass to a rewrite, then I get a correct response from the 8888 server.

if ($request_method = POST) {
    rewrite ^ http://127.0.0.1:8888;
}

I need the POST parameters to be passed through, so the rewrite does not work for me.

How can I make the proxy_pass to server 8888 work?
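One guess I have not verified yet: since the Express app only defines routes at /, the proxied request for /amit4got/flows may be 404-ing inside the 8888 server itself. Rewriting the URI before proxying might avoid that (an untested sketch):

if ($request_method = POST) {
    rewrite ^ / break;
    proxy_pass http://127.0.0.1:8888;
}

Unlike the redirect variant above, rewrite ... break keeps the request on the server side, so the POST body should be preserved.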

Thanks, Amit

Remote access Mikrotik with no public IP

Posted: 13 Jun 2021 05:04 PM PDT

I have a MikroTik behind a broadband router. I want to access the MikroTik using the web interface or Winbox, but the problems are:

1- The MikroTik can access the internet but has no public IP

2- Using a VPN is forbidden here

3- The broadband router is not physically accessible, and no port forwarding is allowed

Is there any solution, like reverse SSH or IP cloud, to access the MikroTik via a VPS (Linux or Windows)? For example, something like the sketch below.
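Here is the reverse-SSH idea I am considering, assuming a Linux machine on the same LAN can reach the MikroTik (192.168.88.1 and vps.example.com are placeholders; Winbox uses TCP port 8291):

# On a Linux host inside the LAN: publish the MikroTik's Winbox port on the VPS
ssh -N -R 0.0.0.0:8291:192.168.88.1:8291 user@vps.example.com

# The VPS's sshd must allow remote binds on non-loopback addresses:
#   /etc/ssh/sshd_config -> GatewayPorts yes

# Then, from anywhere: point Winbox at vps.example.com:8291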

Please help me, I'm in a bad situation.

SLES 11 SP3 with Bootable Driver Kit - unable to fetch image error

Posted: 13 Jun 2021 06:06 PM PDT

I'm trying to install SLES 11 SP3 using Cobbler, but it fails after downloading the NBP file. The error on the screen is "Unable to fetch TFTP image".

I have a similar setup for SLES 11 SP2 and it is working fine. The difference with this setup is that I am installing SLES 11 SP3 on an IBM x3500 M5 server, which requires a bootable driver kit (BDK) to be installed prior to the installation of the OS itself.

My setup is as follows:

/var/lib/tftpboot> tree uefisp3
uefisp3
├── biostest
├── bootx64.efi
├── elilo.conf
├── initrd
├── linux
├── memtest
├── message
└── pxelinux.0

The initrd and linux files are not from the SLES DVD, but from the BDK image.

Quoting from https://drivers.suse.com/doc/Usage/Driver_Kits.html: Copy the kernel and initrd images from the driver kit iso image to the appropriate location on your tftp boot server. The initrd and kernel image are found under the /boot/x86_64/loader directory.
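For reference, this is roughly how I copied them (a sketch; driverkit.iso stands for whatever the BDK image file is called):

# Mount the BDK ISO and copy the PXE kernel and initrd into the TFTP tree
mount -o loop driverkit.iso /mnt
cp /mnt/boot/x86_64/loader/linux  /var/lib/tftpboot/images/sles11sp3/
cp /mnt/boot/x86_64/loader/initrd /var/lib/tftpboot/images/sles11sp3/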

I have tried changing the /etc/cobbler/dhcp.template by pointing the filename to uefisp3/bootx64.efi, pxelinux.0 and uefisp3/pxelinux.0 but none of them work.

My cobbler distro report:

Name                           : sles11sp3
Architecture                   : x86_64
Breed                          : suse
Comment                        :
Initrd                         : /driverkit/boot/x86_64/loader/initrd
Kernel                         : /driverkit/boot/x86_64/loader/linux
Kernel Options                 : {'install': 'http://192.168.0.10/sles/sles11sp3', 'addon': 'http://192.168.0.10/sles/driverkit'}
Kernel Options (Post Install)  : {}
Kickstart Metadata             : {}
Management Classes             : []
OS Version                     : sles10
Owners                         : ['admin']
Red Hat Management Key         : <<inherit>>
Red Hat Management Server      : <<inherit>>
Template Files                 : {}

My elilo.conf (to be honest I'm not even sure if I need this file, but this is how I did it with SLES 11 SP2):

/var/lib/tftpboot> cat uefisp3/elilo.conf
prompt
timeout=100
default=linux

image=linux
    label=linux
    description = "Installation"
    initrd=initrd
    append="/images/sles11sp3/initrd textmode=1 install=http://192.168.0.10/sles/sles11sp3 autoyast=http://192.168.0.10/cblr/svc/op/ks/profile/raid1drbd_sp3i addon=http://192.168.0.10/sles/driverkit"

Excerpt from pxelinux.cfg/default file:

LABEL raid1drbd_sp3
        kernel /images/sles11sp3/linux
        MENU LABEL raid1drbd_sp3
        append initrd=/images/sles11sp3/initrd textmode=1 install=http://192.168.0.10/sles/sles11sp3 addon=http://192.168.0.10/sles/driverkit autoyast=http://192.168.0.10/cblr/svc/op/ks/profile/raid1drbd_sp3
        ipappend 2

The TFTP server works; I was able to fetch some files from it manually. In /var/log/messages there is an error "tftp: client does not accept options", which from what I read is most likely not relevant to the issue I'm facing now.
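The manual fetch looked roughly like this (tftp-hpa client syntax):

tftp 192.168.0.10 -c get uefisp3/bootx64.efi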

Anyone with success installing SLES 11 SP3 with the driver kit?

UPDATE:

Captured the following during PXE boot attempt:

PXE boot on SLES 11 SP3

2015-05-21 16:30:52.830169 IP 192.168.0.50.fj-hdnet > spacewalk.tftp:  49 RRQ "uefisp3/bootx64.efi" octet tsize 0 blksize 1468
2015-05-21 16:30:52.839093 IP 192.168.0.50.h323gatedisc > spacewalk.tftp:  41 RRQ "uefisp3/bootx64.efi" octet blksize 1468
2015-05-21 16:30:53.360209 IP 192.168.0.50.h323gatestat > spacewalk.tftp:  41 RRQ "uefisp3/bootx64.efi" octet blksize 1468
2015-05-21 16:30:53.872046 IP 192.168.0.50.h323hostcall > spacewalk.tftp:  30 RRQ "/grub.efi" octet blksize 512
2015-05-21 16:30:53.875762 IP 192.168.0.50.caicci > spacewalk.tftp:  30 RRQ "/grub.efi" octet blksize 512

I am running out of time and will do further testing tomorrow. Thanks for the idea. Brilliant!

SECOND UPDATE:

Currently PXE works, as does the auto installation. However, the server is not able to boot up due to an error with elilo.conf. I was not around during the installation, so I am not sure what went wrong, and I haven't had the chance to perform another round of installation yet.

Thanks.

How to get access to Icinga on Apache 2.4?

Posted: 13 Jun 2021 04:01 PM PDT

I am trying to install Icinga on a FreeBSD 9.1 box with Apache 2.4. I use the Apache config which was provided with the Icinga port.

But when I try to access the web frontend, I get the following error in my log:

AH01276: Cannot serve directory /usr/local/www/icinga/: No matching DirectoryIndex (none) found, and server-generated directory index forbidden by Options directive

I have a DirectoryIndex directive in my httpd.conf (using index.html as the index), but not in the Icinga config snippet. The Options directive is Options None.

When I try to specify a custom DirectoryIndex in the Icinga config snippet, I get the following error:

Invalid command 'DirectoryIndex', perhaps misspelled or defined by a module not included in the server configuration

So Google tells me that maybe mod_dir isn't enabled. It does not appear in the modules list in httpd.conf where I can uncomment the modules to load, yet the DirectoryIndex directive in my httpd.conf is accepted by Apache.
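What I plan to try next is loading mod_dir explicitly and adding a DirectoryIndex to the Icinga snippet (a sketch; the module path is the usual FreeBSD port default and may differ on my box):

# httpd.conf: make sure mod_dir is actually loaded
LoadModule dir_module libexec/apache24/mod_dir.so

# Icinga snippet: give the directory an index and allow access (Apache 2.4 syntax)
<Directory "/usr/local/www/icinga">
    DirectoryIndex index.html
    Options None
    Require all granted
</Directory>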

So I am struggling to get the Icinga web frontend to work, and I was hoping someone could help me.

add space to virtual disk on vmware

Posted: 13 Jun 2021 07:02 PM PDT

I have a VMware Server 3.5 system with 2 VMs. On one powered-on VM I changed the disk size from 1 TB to 1.5 TB, but the VM didn't see any new unallocated space, so I rebooted the server twice. Nothing happened... The guest OS is CentOS and the two disks are LVM. fdisk sees the new space, but there are no partitions on the disk, and LVM does not see any free space...

[root@srv-archive ~]# dmesg |grep sdb
sd 2:0:1:0: [sdb] 3145728000 512-byte logical blocks: (1.61 TB/1.46 TiB)
sd 2:0:1:0: [sdb] Write Protect is off
sd 2:0:1:0: [sdb] Mode Sense: 61 00 00 00
sd 2:0:1:0: [sdb] Cache data unavailable
sd 2:0:1:0: [sdb] Assuming drive cache: write through
sd 2:0:1:0: [sdb] Cache data unavailable
sd 2:0:1:0: [sdb] Assuming drive cache: write through
 sdb: sda1 sda2
sd 2:0:1:0: [sdb] Cache data unavailable
sd 2:0:1:0: [sdb] Assuming drive cache: write through
sd 2:0:1:0: [sdb] Attached SCSI disk
dracut: Scanning devices sda2 sdb  for LVM logical volumes vg_srvarchive/lv_swap vg_srvarchive/lv_root

and fdisk

# fdisk -l /dev/sdb

Disk /dev/sdb: 1610.6 GB, 1610612736000 bytes

255 heads, 63 sectors/track, 195812 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

and the following is the pvdisplay output, where you can see 0 free space:

# pvdisplay
  --- Physical volume ---
  PV Name               /dev/sdb
  VG Name               vg_archive
  PV Size               1000,00 GiB / not usable 4,00 MiB
  Allocatable           yes (but full)
  PE Size               4,00 MiB
  Total PE              255999
  Free PE               0
  Allocated PE          255999
  PV UUID               3Qftxe-rpff-TjTA-9CA4-BoeM-qEgc-RzSzXL
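The PV is still 1000 GiB even though the disk is now 1.5 TB, so I suspect the PV itself needs resizing. This is the sequence I am planning to try (a sketch; lv_archive is a guessed LV name, substitute the real one from lvdisplay):

# Re-read the disk's new capacity without rebooting
echo 1 > /sys/class/block/sdb/device/rescan

# Grow the PV to fill the whole disk (sdb is a whole-disk PV, no partition table)
pvresize /dev/sdb

# Hand the new free extents to the LV, then grow the filesystem
lvextend -l +100%FREE /dev/vg_archive/lv_archive
resize2fs /dev/vg_archive/lv_archive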

I only want to expand my LVM device. Thanks very much, cheers, Luigi.

Outlook Cannot Establish Connection To Exchange while using Citrix VPN

Posted: 13 Jun 2021 04:01 PM PDT

I have several users who travel and connect to our network using Citrix VPN. Once they are connected, they can reach our servers, etc., but when they open Outlook they are greeted with a login window. No combination of usernames will allow Outlook to connect to the Exchange server. If the user is in the office, the connection works just fine. Note: this affects only a handful of users out of hundreds. I even tried a brand-new machine, but the issue still remains for these people. Any thoughts?

TPROXY iptables and l7 filter

Posted: 13 Jun 2021 08:08 PM PDT

I am struggling with the TPROXY rule in the mangle table. I configured these rules:

iptables -t mangle -I PREROUTING 1 -m layer7 --l7dir /etc/l7-protocols --l7proto http -p tcp --dport 80 -j TPROXY --on-port 1035 --tproxy-mark 0xffff
ip rule add fwmark 0xffff lookup 100
ip route add local 0.0.0.0/0 dev lo table 100

The http pattern file contains this really simple regexp:

.*  

This way, everything should match.

I wrote a program which opens a SOCK_RAW socket and prints all received packets; I tested it and it works, I am sure about that. What I see is that I cannot observe the redirection caused by the TPROXY rule; in fact, I think it redirects nothing.
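One thing I am starting to suspect (an unverified sketch): TPROXY does not rewrite packets the way REDIRECT does; it diverts them to a local socket that was bound with the IP_TRANSPARENT option, so my plain SOCK_RAW listener may simply never see them. Something like this would be needed on port 1035 (Python here just for illustration):

import socket

# From <linux/in.h>; older Python versions don't expose this constant.
IP_TRANSPARENT = 19

# TPROXY only delivers diverted packets to transparent sockets.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_IP, IP_TRANSPARENT, 1)  # requires root / CAP_NET_ADMIN
s.bind(('0.0.0.0', 1035))                       # the --on-port value
s.listen(5)

conn, addr = s.accept()
# On an accepted TPROXY connection, getsockname() shows the original destination.
print('client:', addr, 'original destination:', conn.getsockname())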

Do you have any suggestions? Maybe I misunderstand some iptables or l7-filter rule and my problem is really simple.

Thanks a lot in advance! Pietro.
